Assets from minified CSS not downloaded #169
Comments
As most modern websites are now minified, this is a major problem.
In htsparse.c, we have on line 1348:
It's supposed to be supported!?
I'm experiencing the same issue: the parser finds and downloads the first asset in a minified CSS file, but then stops. From my testing, the only workaround seems to be placing at least one line break character between each asset URL. Effectively, that means working with minified CSS (and probably other files falling under the "javascript" parser code path) isn't possible right now. A minimal reproducible example is, for instance:
It will get the first image, but not the second. This however works fine:
I imagine a fix shouldn't be too hard. I took a look at https://github.com/xroche/httrack/blob/master/src/htsparse.c, but the code is quite a challenge to grasp (I may give it another shot once I have more time on my hands). For anyone already familiar with this code, I believe fixing this would be quite worthwhile, considering that most CSS etc. is minified today.
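Until the parser itself is fixed, the line-break workaround described above could be applied as a preprocessing step when you control the CSS being served. The following is only a rough sketch under that assumption: `break_after_urls` and the sample stylesheet are hypothetical and are not part of HTTrack, and the regex does not try to skip `url(...)` text inside comments or strings.

```python
import re

# Matches a CSS url(...) token, with a double-quoted, single-quoted,
# or unquoted argument. Hypothetical helper pattern, not HTTrack code.
_URL_TOKEN = re.compile(
    r"url\(\s*(?:\"[^\"]*\"|'[^']*'|[^)]*)\s*\)",
    re.IGNORECASE,
)


def break_after_urls(css: str) -> str:
    """Insert a newline after every url(...) token.

    Whitespace after a url() token does not change the meaning of the
    stylesheet, so each asset URL ends up on its own line.
    """
    return _URL_TOKEN.sub(lambda m: m.group(0) + "\n", css)


if __name__ == "__main__":
    # Hypothetical minified input with two assets on a single line.
    minified = ".a{background:url(a.png)}.b{background:url(b.png)}"
    print(break_after_urls(minified))
```

This only sidesteps the problem for stylesheets you can rewrite yourself; third-party minified CSS would still need a fix in the parser.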
Background images from minified CSS were not downloaded. In minified CSS they are stored like this: