Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue with URL Escaping #1965

Closed
simonspa opened this issue Jan 31, 2023 · 8 comments
Closed

Issue with URL Escaping #1965

simonspa opened this issue Jan 31, 2023 · 8 comments
Labels

Comments

@simonspa
Copy link

Describe the bug
Yesterday I started to see the log message below in the NC logs for every cron background job iteration run on the server. I am not aware of any changes to the system (i.e. no updates or similar) so I would expect this to be from some user data - but the log message doesn't say mutch.

To be hones, I'm not even sure it is an issue with Bookmarks, but that's most prominently show in the error log.

Server (please complete the following information):

  • OS: Debian 11
  • HTTP server: nginx 1.1
  • Database: MariaDB 10.5.18
  • PHP version: 8.1.14
  • Nextcloud version: 25.0.3
  • Bookmarks app version: 12.0.0
  • Activated Nextcloud Apps:
sudo -u www-data php occ app:list
Enabled:
  - activity: 2.17.0
  - admin_audit: 1.15.0
  - bookmarks: 12.0.0
  - bruteforcesettings: 2.5.0
  - calendar: 4.2.2
  - cloud_federation_api: 1.8.0
  - comments: 1.15.0
  - contacts: 5.0.3
  - contactsinteraction: 1.6.0
  - cospend: 1.5.5
  - dashboard: 7.5.0
  - dav: 1.24.0
  - deck: 1.8.3
  - event_update_notification: 2.0.0
  - federatedfilesharing: 1.15.0
  - federation: 1.15.0
  - files: 1.20.1
  - files_downloadactivity: 1.15.0
  - files_pdfviewer: 2.6.0
  - files_rightclick: 1.4.0
  - files_sharing: 1.17.0
  - files_trashbin: 1.15.0
  - files_versions: 1.18.0
  - holiday_calendars: 0.2.1
  - impersonate: 1.11.0
  - logreader: 2.10.0
  - lookup_server_connector: 1.13.0
  - memories: 4.10.3
  - metadata: 0.17.0
  - news: 20.0.1
  - notes: 4.6.0
  - notifications: 2.13.1
  - notify_push: 0.5.2
  - oauth2: 1.13.0
  - password_policy: 1.15.0
  - photos: 2.0.1
  - previewgenerator: 5.1.1
  - privacy: 1.9.0
  - provisioning_api: 1.15.0
  - quota_warning: 1.15.0
  - richdocuments: 7.1.0
  - serverinfo: 1.15.0
  - settings: 1.7.0
  - sharebymail: 1.15.0
  - spreed: 15.0.3
  - tasks: 0.14.5
  - text: 3.6.0
  - theming: 2.0.1
  - twofactor_backupcodes: 1.14.0
  - twofactor_totp: 7.0.0
  - twofactor_webauthn: 1.0.0
  - updatenotification: 1.15.0
  - user_migration: 2.0.1
  - user_status: 1.5.0
  - viewer: 1.9.0
  - weather_status: 1.5.0
  - workflowengine: 2.7.0
  • Nextcloud external user backend: none

Additional context
Add any other context about the problem here.

Nextcloud log (nextcloud/data/nextcloud.log)

{"reqId":"MSKDpMlnrAv6v2pUYQqn","level":3,"time":"2023-01-31T17:08:03+01:00","remoteAddr":"","user":"--","app":"core","method":"","url":"--","message":"Error while running background job (class: OCA\\Bookmarks\\BackgroundJobs\\CrawlJob, arguments: )","userAgent":"--","version":"25.0.3.2","exception":{"Exception":"League\\Uri\\Exceptions\\SyntaxError","Message":"If there is no authority the path `//null` can not start with a `//`.","Code":0,"Trace":[{"file":"/var/www/nextcloud/apps/twofactor_webauthn/vendor/league/uri/src/Uri.php","line":1359,"function":"assertValidState","class":"League\\Uri\\Uri","type":"->","args":[]},{"file":"/var/www/nextcloud/apps/twofactor_webauthn/vendor/league/uri/src/Http.php","line":276,"function":"withPath","class":"League\\Uri\\Uri","type":"->","args":["//null"]},{"file":"/var/www/nextcloud/apps/twofactor_webauthn/vendor/league/uri/src/UriResolver.php","line":83,"function":"withPath","class":"League\\Uri\\Http","type":"->","args":["//null"]},{"file":"/var/www/nextcloud/apps/news/vendor/fivefilters/readability.php/src/Readability.php","line":813,"function":"resolve","class":"League\\Uri\\UriResolver","type":"::","args":[{"__class__":"League\\Uri\\Http"},{"__class__":"League\\Uri\\Http"}]},{"file":"/var/www/nextcloud/apps/news/vendor/fivefilters/readability.php/src/Readability.php","line":2240,"function":"toAbsoluteURI","class":"fivefilters\\Readability\\Readability","type":"->","args":["null"]},{"function":"fivefilters\\Readability\\{closure}","class":"fivefilters\\Readability\\Readability","type":"->","args":["*** sensitive parameters replaced ***"]},{"file":"/var/www/nextcloud/apps/news/vendor/fivefilters/readability.php/src/Readability.php","line":2237,"function":"preg_replace_callback","args":["/(\\S+)(\\s+[\\d.]+[xw])?(\\s*(?:,|$))/",{"__class__":"Closure"},"null 1x, null 2x"]},{"function":"fivefilters\\Readability\\{closure}","class":"fivefilters\\Readability\\Readability","type":"->","args":["*** sensitive parameters replaced ***"]},{"file":"/var/www/nextcloud/apps/news/vendor/fivefilters/readability.php/src/Readability.php","line":2245,"function":"array_walk","args":[[{"__class__":"fivefilters\\Readability\\Nodes\\DOM\\DOMElement","schemaTypeInfo":null,"contentScore":0},{"__class__":"fivefilters\\Readability\\Nodes\\DOM\\DOMElement","schemaTypeInfo":null,"contentScore":0},{"__class__":"fivefilters\\Readability\\Nodes\\DOM\\DOMElement","schemaTypeInfo":null,"contentScore":0},{"__class__":"fivefilters\\Readability\\Nodes\\DOM\\DOMElement","schemaTypeInfo":null,"contentScore":0},{"__class__":"fivefilters\\Readability\\Nodes\\DOM\\DOMElement","schemaTypeInfo":null,"contentScore":0},"And 25 more entries, set log level to debug to see all entries"],{"__class__":"Closure"}]},{"file":"/var/www/nextcloud/apps/news/vendor/fivefilters/readability.php/src/Readability.php","line":258,"function":"postProcessContent","class":"fivefilters\\Readability\\Readability","type":"->","args":[{"__class__":"fivefilters\\Readability\\Nodes\\DOM\\DOMDocument","config":null,"contentScore":0}]},{"file":"/var/www/nextcloud/apps/bookmarks/lib/Service/CrawlService.php","line":125,"function":"parse","class":"fivefilters\\Readability\\Readability","type":"->","args":[{"__class__":"GuzzleHttp\\Psr7\\Stream"}]},{"file":"/var/www/nextcloud/apps/bookmarks/lib/Service/CrawlService.php","line":106,"function":"archiveContent","class":"OCA\\Bookmarks\\Service\\CrawlService","type":"->","args":[{"__class__":"OCA\\Bookmarks\\Db\\Bookmark","id":2998},{"__class__":"GuzzleHttp\\Psr7\\Response"}]},{"file":"/var/www/nextcloud/apps/bookmarks/lib/BackgroundJobs/CrawlJob.php","line":54,"function":"crawl","class":"OCA\\Bookmarks\\Service\\CrawlService","type":"->","args":[{"__class__":"OCA\\Bookmarks\\Db\\Bookmark","id":2998}]},{"file":"/var/www/nextcloud/lib/public/BackgroundJob/Job.php","line":78,"function":"run","class":"OCA\\Bookmarks\\BackgroundJobs\\CrawlJob","type":"->","args":[null]},{"file":"/var/www/nextcloud/lib/public/BackgroundJob/TimedJob.php","line":103,"function":"start","class":"OCP\\BackgroundJob\\Job","type":"->","args":[{"__class__":"OC\\BackgroundJob\\JobList"}]},{"file":"/var/www/nextcloud/lib/public/BackgroundJob/TimedJob.php","line":93,"function":"start","class":"OCP\\BackgroundJob\\TimedJob","type":"->","args":[{"__class__":"OC\\BackgroundJob\\JobList"}]},{"file":"/var/www/nextcloud/cron.php","line":152,"function":"execute","class":"OCP\\BackgroundJob\\TimedJob","type":"->","args":[{"__class__":"OC\\BackgroundJob\\JobList"},{"__class__":"OC\\Log"}]}],"File":"/var/www/nextcloud/apps/twofactor_webauthn/vendor/league/uri/src/Uri.php","Line":1013,"message":"Error while running background job (class: OCA\\Bookmarks\\BackgroundJobs\\CrawlJob, arguments: )","exception":[],"CustomMessage":"Error while running background job (class: OCA\\Bookmarks\\BackgroundJobs\\CrawlJob, arguments: )"},"id":"63d93d102c126"}
@simonspa simonspa added the bug label Jan 31, 2023
@simonspa
Copy link
Author

I caught another instance of this. In that case there is a link that starts with

//https://domain.tld/assets/images/0/myimage.jpg

In the original webpage that is bookmarked, the image is linked as

<figure class="image_container float_right">
              <a href="files/somefolder/myimage.jpg" data-lightbox="8e0113">
<img src="assets/images/0/myimage.jpg" width="800" height="521" alt="">
              </a>
    </figure>

so apart from the lightbox there is nothing particularly curious about this...

Still, for every background job attempt from Bookmarks I get an error logged in NC.

@simonspa
Copy link
Author

...and the lightbox is a jquery colorbox:

<script src="assets/jquery/colorbox/1.6.1/js/colorbox.min.js"></script>
<script>
(function($) {
  $(document).ready(function() {
    $('a[data-lightbox]').map(function() {
      $(this).colorbox({
        // Put custom options here
        maxWidth:'90%',
        maxHeight:'90%',
        loop:false,
        rel:$(this).attr('data-lightbox')
      });
    });
  });
})(jQuery);
</script>

@simonspa simonspa changed the title League\Uri\Exceptions\SyntaxError: If there is no authority the path //null can not start with a //. Issue with URL Escaping May 24, 2023
@simonspa
Copy link
Author

I have found another instance of this issue, but in this case it is more clear what the issue is: apparently the URL string is not escaped when passed around and parsed, in the instance I found now there was a literal space instead of %20 and the URL was split at this position:

{"reqId":"RR1Ao3WDeW51RkP8JQ2O","level":3,"time":"2023-05-24T08:03:03+02:00","remoteAddr":"","user":"--","app":"core","method":"","url":"--","message":"Error while running background job (class: OCA\\Bookmarks\\BackgroundJobs\\CrawlJob, arguments: )","userAgent":"--","version":"26.0.1.1","exception":{"Exception":"League\\Uri\\Exceptions\\SyntaxError","Message":"If there is no authority the path `//Matratzenbezug.jpg` can not start with a `//`.","Code":0,"Trace":[{"file":"/var/www/nextcloud/apps/twofactor_webauthn/vendor/league/uri/src/Uri.php","line":1359,"function":"assertValidState","class":"League\\Uri\\Uri","type":"->","args":[]},{"file":"/var/www/nextcloud/apps/twofactor_webauthn/vendor/league/uri/src/Http.php","line":276,"function":"withPath","class":"League\\Uri\\Uri","type":"->","args":["//Matratzenbezug.jpg"]},{"file":"/var/www/nextcloud/apps/twofactor_webauthn/vendor/league/uri/src/UriResolver.php","line":83,"function":"withPath","class":"League\\Uri\\Http","type":"->","args":["//Matratzenbezug.jpg"]},{"file":"/var/www/nextcloud/apps/news/vendor/fivefilters/readability.php/src/Readability.php","line":813,"function":"resolve","class":"League\\Uri\\UriResolver","type":"::","args":[["League\\Uri\\Http"],["League\\Uri\\Http"]]},{"file":"/var/www/nextcloud/apps/news/vendor/fivefilters/readability.php/src/Readability.php","line":2246,"function":"toAbsoluteURI","class":"fivefilters\\Readability\\Readability","type":"->","args":["Matratzenbezug.jpg?p=n&vh=840500&width=390&height=360&func=bound"]},{"function":"fivefilters\\Readability\\{closure}","class":"fivefilters\\Readability\\Readability","type":"->","args":["*** sensitive parameters replaced ***"]},{"file":"/var/www/nextcloud/apps/news/vendor/fivefilters/readability.php/src/Readability.php","line":2243,"function":"preg_replace_callback","args":["/(\\S+)(\\s+[\\d.]+[xw])?(\\s*(?:,|$))/",["Closure"],"https://media.sleep-hero.de/MDE/uploads/product/Tauro Matratzenbezug.jpg?p=n&vh=840500&width=390&height=360&func=bound 2x"]},{"function":"fivefilters\\Readability\\{closure}","class":"fivefilters\\Readability\\Readability","type":"->","args":["*** sensitive parameters replaced ***"]},{"file":"/var/www/nextcloud/apps/news/vendor/fivefilters/readability.php/src/Readability.php","line":2251,"function":"array_walk","args":[["*** sensitive parameters replaced ***","*** sensitive parameters replaced ***","*** sensitive parameters replaced ***","*** sensitive parameters replaced ***","*** sensitive parameters replaced ***","And 62 more entries, set log level to debug to see all entries"],["Closure"]]},{"file":"/var/www/nextcloud/apps/news/vendor/fivefilters/readability.php/src/Readability.php","line":258,"function":"postProcessContent","class":"fivefilters\\Readability\\Readability","type":"->","args":[["fivefilters\\Readability\\Nodes\\DOM\\DOMDocument",null,0]]},{"file":"/var/www/nextcloud/apps/bookmarks/lib/Service/CrawlService.php","line":125,"function":"parse","class":"fivefilters\\Readability\\Readability","type":"->","args":[["GuzzleHttp\\Psr7\\Stream"]]},{"file":"/var/www/nextcloud/apps/bookmarks/lib/Service/CrawlService.php","line":106,"function":"archiveContent","class":"OCA\\Bookmarks\\Service\\CrawlService","type":"->","args":[["OCA\\Bookmarks\\Db\\Bookmark",3221],["GuzzleHttp\\Psr7\\Response"]]},{"file":"/var/www/nextcloud/apps/bookmarks/lib/BackgroundJobs/CrawlJob.php","line":54,"function":"crawl","class":"OCA\\Bookmarks\\Service\\CrawlService","type":"->","args":[["OCA\\Bookmarks\\Db\\Bookmark",3221]]},{"file":"/var/www/nextcloud/lib/public/BackgroundJob/Job.php","line":78,"function":"run","class":"OCA\\Bookmarks\\BackgroundJobs\\CrawlJob","type":"->","args":[null]},{"file":"/var/www/nextcloud/lib/public/BackgroundJob/TimedJob.php","line":103,"function":"start","class":"OCP\\BackgroundJob\\Job","type":"->","args":[["OC\\BackgroundJob\\JobList"]]},{"file":"/var/www/nextcloud/lib/public/BackgroundJob/TimedJob.php","line":93,"function":"start","class":"OCP\\BackgroundJob\\TimedJob","type":"->","args":[["OC\\BackgroundJob\\JobList"]]},{"file":"/var/www/nextcloud/cron.php","line":152,"function":"execute","class":"OCP\\BackgroundJob\\TimedJob","type":"->","args":[["OC\\BackgroundJob\\JobList"],["OC\\Log"]]}],"File":"/var/www/nextcloud/apps/twofactor_webauthn/vendor/league/uri/src/Uri.php","Line":1013,"message":"Error while running background job (class: OCA\\Bookmarks\\BackgroundJobs\\CrawlJob, arguments: )","exception":[],"CustomMessage":"Error while running background job (class: OCA\\Bookmarks\\BackgroundJobs\\CrawlJob, arguments: )"},"id":"646da9214d26f"}

The URL passed around is https://media.sleep-hero.de/MDE/uploads/product/Tauro Matratzenbezug.jpg?p=n&vh=840500&width=390&height=360&func=bound 2x which should likely be https://media.sleep-hero.de/MDE/uploads/product/Tauro%20Matratzenbezug.jpg?p=n&vh=840500&width=390&height=360&func=bound%202x

@marcelklehr would you mind to take a look?

@simonspa
Copy link
Author

simonspa commented Jul 5, 2023

I have again a bookmark with this issue. According to the trace, the issue appears in this call: https://github.com/nextcloud/bookmarks/blob/master/lib/Service/CrawlService.php#L119-L128

However, according to the linked issue above, the problem is not in FiveFilters, because the Readability object is configured correctly:

			$config = new Configuration();
			$config
				->setFixRelativeURLs(true)
				->setOriginalURL($bookmark->getUrl())
				->setSubstituteEntities(true);
			$readability = new Readability($config);

In any case, the ParseException does not seem to be caught properly, leading to the exception being propagated all the way up.

@marcelklehr
Copy link
Member

Hey @simonspa! I'm sorry for leaving you out in the cold on this one. Taking a look now.

@marcelklehr
Copy link
Member

marcelklehr commented Jul 9, 2023

It would appear the problematic URL with the space can't be fixed because the website appears to be using a srcset attribute where whitespace is significant (used to separate multiple URLs). That's a bug in the website.

@marcelklehr
Copy link
Member

marcelklehr commented Jul 9, 2023

I've extended the catch handler to cover whatever Readability might throw.

@simonspa
Copy link
Author

simonspa commented Jul 9, 2023

Cool, thanks a lot @marcelklehr ! 😻

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants