Description
I tried to scrape a web page with the following User-Agent request header:
request: {
  headers: {
    'User-Agent': 'Mozilla/5.0 (Linux; Android 4.2.1; en-us; Nexus 4 Build/JOP40D) AppleWebKit/535.19 (KHTML, like Gecko) Chrome/40.0.0 Mobile Safari/535.19'
  }
}
With this header, the scraped website rendered properly.
When I change the User-Agent request header to
request: {
  headers: {
    'User-Agent': 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.96 Mobile Safari/537.36'
  }
}
the CSS files are still downloaded along with the scraped website, but the CSS styles are not applied to the scraped pages. Here is the code:
var options = {
  urls: [{
    url: actualURL
  }],
  request: {
    headers: {
      'User-Agent': 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.96 Mobile Safari/537.36'
    }
  },
  ignoreErrors: true,
  directory: DIRECTORY_PATH,
  // Sort downloaded resources into subdirectories by file extension
  subdirectories: [
    {directory: 'img', extensions: ['.jpg', '.png', '.svg']},
    {directory: 'js', extensions: ['.js']},
    {directory: 'css', extensions: ['.css']},
    {directory: 'fonts', extensions: ['.woff', '.otf', '.ttf', '.eot']}
  ],
  // Reject 404 and 444 responses; pass everything else through with its headers
  httpResponseHandler: (response) => {
    if (response.statusCode === 404) {
      return Promise.reject(new Error('status is 404'));
    } else if (response.statusCode === 444) {
      console.log("Connection Closed");
      return Promise.reject(new Error('status is 444'));
    } else {
      return Promise.resolve({
        body: response.body,
        metadata: {
          headers: response.headers,
        },
      });
    }
  },
  onResourceSaved: (resource) => {
    console.log(`Resource ${resource} saved to filesystem`);
  },
  onResourceError: (resource, err) => {
    console.log(`Resources ${resource} not saved because of ${err}`);
  }
};
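One way to narrow this down might be to check whether the saved HTML points at the local css directory or still references remote stylesheets. Below is a minimal sketch for that check, not a fix. `classifyStylesheetLinks` is a hypothetical helper written for this report and is not part of the scraper library; it uses a simple regular expression rather than a real HTML parser, so it only handles plain `<link rel="stylesheet" href="...">` tags with quoted href values.

```javascript
// Hypothetical diagnostic helper (an assumption, not a library API).
// Collects the href of every <link rel="stylesheet"> tag in an HTML string
// and labels it 'remote' (absolute or protocol-relative URL) or 'local'.
function classifyStylesheetLinks(html) {
  const linkTagPattern = /<link\b[^>]*>/gi;
  const results = [];
  for (const tag of html.match(linkTagPattern) || []) {
    // Skip <link> tags that are not stylesheets (icons, preloads, ...)
    if (!/rel\s*=\s*["']?stylesheet["']?/i.test(tag)) continue;
    const hrefMatch = tag.match(/href\s*=\s*["']([^"']+)["']/i);
    if (!hrefMatch) continue;
    const href = hrefMatch[1];
    const remote = /^(https?:)?\/\//i.test(href);
    results.push({ href, location: remote ? 'remote' : 'local' });
  }
  return results;
}

// Example usage against the saved page (the file name is an assumption):
// const fs = require('fs');
// const path = require('path');
// const html = fs.readFileSync(path.join(DIRECTORY_PATH, 'index.html'), 'utf8');
// console.log(classifyStylesheetLinks(html));
```

If every stylesheet is reported as local and the files exist under the css directory, the problem is more likely in the CSS content itself (for example, different styles served for the newer User-Agent) than in how the links were rewritten.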