
cleverlybulk/ImageScraper

ImageScraper :

A scraping script written in python3 with urllib. It downloads all the images from a given website.

description :

This script has two classes. The first one collects up to a maximum number of URLs (set by the variable max_urls = 50), and the second one visits each URL and downloads every picture on that webpage.
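The two-step idea described above can be sketched roughly as follows. This is only an illustration and not the code from "imageScraper.py": the names `LinkCollector`, `ImageCollector` and `collect` are hypothetical, and the sketch parses an HTML string that has already been fetched, so it does not touch the network.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Hypothetical first step: gather page URLs from <a href>, capped at max_urls."""

    def __init__(self, base_url, max_urls=50):
        super().__init__()
        self.base_url = base_url
        self.max_urls = max_urls
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # Ignore non-links, and stop collecting once the cap is reached.
        if tag != "a" or len(self.urls) >= self.max_urls:
            return
        href = dict(attrs).get("href")
        if href:
            url = urljoin(self.base_url, href)  # resolve relative links
            if url not in self.urls:
                self.urls.append(url)


class ImageCollector(HTMLParser):
    """Hypothetical second step: gather image URLs from <img src>."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.images.append(urljoin(self.base_url, src))


def collect(html, base_url, max_urls=50):
    """Return (page_urls, image_urls) found in one HTML document."""
    links = LinkCollector(base_url, max_urls)
    links.feed(html)
    images = ImageCollector(base_url)
    images.feed(html)
    return links.urls, images.images
```

In a real run the HTML would come from something like `urllib.request.urlopen(url).read().decode()`, and each image URL would then be saved to disk.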

updates

* archiever added (it stores the visited URLs in "visitedUrls.csv")
* the script no longer downloads the same picture twice, thanks to the archiever module
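A CSV-backed "already visited" check like the one described above could look something like this. It is a minimal sketch under assumptions: the class name `VisitedArchive` and its methods are hypothetical, and the real archiever module may lay out "visitedUrls.csv" differently (here it is assumed to hold one URL per row).

```python
import csv
import os


class VisitedArchive:
    """Hypothetical archive: remembers URLs in a CSV file, one URL per row."""

    def __init__(self, path="visitedUrls.csv"):
        self.path = path
        self.seen = set()
        # Load URLs recorded by previous runs, if the file already exists.
        if os.path.exists(path):
            with open(path, newline="", encoding="utf-8") as f:
                for row in csv.reader(f):
                    if row:
                        self.seen.add(row[0])

    def is_new(self, url):
        """True if the URL has not been recorded yet."""
        return url not in self.seen

    def add(self, url):
        """Record the URL. Returns False if it was already known."""
        if url in self.seen:
            return False
        self.seen.add(url)
        with open(self.path, "a", newline="", encoding="utf-8") as f:
            csv.writer(f).writerow([url])
        return True
```

A download loop would call `add(image_url)` before fetching and skip the picture whenever it returns `False`.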

my motivations :

To improve my python skills and share something that could be useful for others :) Please donate here: www.assabbane.com

how to use the script:

1. First, you should verify that you have all the dependencies, and by dependencies I mean python3 and the libraries imported at the top of the script "imageScraper.py"
2. how to execute the script:

* go into the folder containing the script

```
cd ImageScraper
```

* then execute the script. Before running it, open the script and replace the target website in the variable 'site = "https://www.example.com"' with your own target, or just test with the given website "https://www.florajet.com/". If you want to increase the number of results, change the value of 'max_urls' to a higher number.

```
python3 imageScraper.py
```
* you can stop the script using Ctrl+C (in most Unix shells, Ctrl+Z only suspends it instead of stopping it)
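Since the script is meant to be stopped by hand, a loop that exits cleanly on Ctrl+C could be sketched like this. The function `run` and its `process` callback are hypothetical names, not part of "imageScraper.py". The sketch relies only on the fact that Python raises `KeyboardInterrupt` when Ctrl+C is pressed.

```python
def run(urls, process):
    """Process each URL until the list is done or the user presses Ctrl+C."""
    done = []
    try:
        for url in urls:
            process(url)      # e.g. download every image found at this URL
            done.append(url)
    except KeyboardInterrupt:
        # Ctrl+C lands here, so the run ends without a traceback.
        print(f"Interrupted after {len(done)} URL(s).")
    return done
```

The returned list holds the URLs that were fully processed before the interruption.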

limits of the script and in case it doesn't work:

Some websites can detect the scraper. I already bypassed that using some proxies and headers, but I could not put that in the repo since it's not very legal, so if it doesn't work on your target you should be ashamed of what you are doing!!

If I helped you, you can always buy me a coffee at: www.assabbane.com
