# Web Crawler for VogelscheucheIP

## Information

### Credits

Inspired by: [Vipul dilip gote](https://medium.com/@vipulgote4/guide-to-make-custom-haar-cascade-xml-file-for-object-detection-with-opencv-6932e22c3f0e)
Written by Yannick Otten

## Programming

### Import

In [1]:
from icrawler.builtin import BingImageCrawler  # crawls Bing image search
from pathlib import Path  # safely create nested directories
from typing import List  # type hints for readability
import shutil  # file copy and move operations
import tqdm  # progress bars
import os  # path handling

### Crawler

In [2]:
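def crawl_bing(classes: List[str], storage_path: str, number: int = 1):
    """Download up to `number` Bing images per keyword into one folder per class."""
    for c in classes:
        # one directory per class; spaces in the keyword become dots
        class_storage_path = os.path.join(storage_path, c.replace(" ", "."))

        # create the directory (and any missing parents)
        Path(class_storage_path).mkdir(parents=True, exist_ok=True)

        # crawl Bing for the keyword and store the images in the class directory
        crawler = BingImageCrawler(storage={'root_dir': class_storage_path})
        crawler.crawl(keyword=c, filters=None, max_num=number, offset=0)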
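
Because `max_num` is only an upper bound and individual downloads can fail, it can be useful to check how many files actually ended up in each class directory. The following is a minimal sketch, not part of the original notebook: `count_images` is a hypothetical helper, and the demo uses a temporary directory with empty placeholder files instead of real downloads.

```python
import tempfile
from pathlib import Path
from typing import Dict


def count_images(storage_path: str) -> Dict[str, int]:
    """Return the number of files in each class directory under storage_path.

    Hypothetical helper: it assumes the layout produced by crawl_bing,
    i.e. one sub-directory per class directly below storage_path.
    """
    counts = {}
    for class_dir in sorted(Path(storage_path).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(1 for f in class_dir.iterdir() if f.is_file())
    return counts


# Self-contained demo: build a fake storage tree with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name, n in [("barn.owl", 3), ("bald.eagle", 1)]:
        class_dir = Path(tmp) / name
        class_dir.mkdir()
        for i in range(n):
            (class_dir / f"{i + 1:06d}.jpg").touch()
    demo_counts = count_images(tmp)

print(demo_counts)
```

Classes whose count is far below the requested `number` are candidates for a second crawl or a different keyword.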

### Positive images

Images that the cascade should detect: one Bing query per bird species, with up to 250 images each.
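
Since every keyword is also used as a directory name, duplicates or stray whitespace in the list lead to merged or oddly named class folders. Before starting a long crawl, a quick check along the following lines may help. This is only a sketch: `find_keyword_issues` is a hypothetical helper, and `sample` is a short stand-in for the full species list.

```python
from collections import Counter
from typing import Dict, List


def find_keyword_issues(keywords: List[str]) -> Dict[str, List[str]]:
    """Report duplicated keywords and keywords with irregular whitespace.

    Hypothetical helper, not part of the original notebook.
    """
    counts = Counter(keywords)
    return {
        # keywords that appear more than once would share one directory
        "duplicates": sorted(k for k, n in counts.items() if n > 1),
        # leading, trailing, or repeated spaces would end up in the directory name
        "whitespace": sorted({k for k in keywords if k != " ".join(k.split())}),
    }


sample = ['barn owl', 'cock of the  rock', 'barn owl', 'kiwi']
issues = find_keyword_issues(sample)
print(issues)
```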

In [3]:
crawl_bing(
[
    'abbotts babbler', 'abbotts booby', 'abyssinian ground hornbill', 'african crowned crane', 
    'african emerald cuckoo', 'african firefinch', 'african oyster catcher', 'african pied hornbill', 
    'albatross', 'alberts towhee', 'alexandrine parakeet', 'alpine chough', 'altamira yellowthroat', 
    'american avocet', 'american bittern', 'american coot', 'american flamingo', 'american goldfinch', 
    'american kestrel', 'american pipit', 'american redstart', 'american wigeon', 'amethyst woodstar', 
    'andean goose', 'andean lapwing', 'andean siskin', 'anhinga', 'anianiau', 'annas hummingbird', 
    'antbird', 'antillean euphonia', 'apapane', 'apostlebird', 'araripe manakin', 'ashy storm petrel', 
    'ashy thrushbird', 'asian crested ibis', 'asian dollard bird', 'auckland shaq', 'austral canastero', 
    'australasian figbird', 'avadavat', 'azaras spinetail', 'azure breasted pitta', 'azure jay', 
    'azure tanager', 'azure tit', 'baikal teal', 'bald eagle', 'bald ibis', 'bali starling', 
    'baltimore oriole', 'bananaquit', 'band tailed guan', 'banded broadbill', 'banded pita', 
    'banded stilt', 'bar-tailed godwit', 'barn owl', 'barn swallow', 'barred puffbird', 
    'barrows goldeneye', 'bay-breasted warbler', 'bearded barbet', 'bearded bellbird', 
    'bearded reedling', 'belted kingfisher', 'bird of paradise', 'black & yellow broadbill', 
    'black baza', 'black cockato', 'black francolin', 'black skimmer', 'black swan', 
    'black tail crake', 'black throated bushtit', 'black throated warbler', 
    'black vented shearwater', 'black vulture', 'black-capped chickadee', 'black-necked grebe', 
    'black-throated sparrow', 'blackburniam warbler', 'blonde crested woodpecker', 
    'blood pheasant', 'blue coau', 'blue dacnis', 'blue grouse', 'blue heron', 
    'blue malkoha', 'blue throated toucanet', 'bobolink', 'bornean bristlehead', 
    'bornean leafbird', 'bornean pheasant', 'brandt cormarant', 'brewers blackbird', 
    'brown crepper', 'brown noody', 'brown thrasher', 'bufflehead', 'bulwers pheasant', 
    'burchells courser', 'bush turkey', 'caatinga cacholote', 'cactus wren', 
    'california condor', 'california gull', 'california quail', 'campo flicker', 
    'canary', 'cape glossy starling', 'cape longclaw', 'cape may warbler', 
    'cape rock thrush', 'capped heron', 'capuchinbird', 'carmine bee-eater', 
    'caspian tern', 'cassowary', 'cedar waxwing', 'cerulean warbler', 
    'chara de collar', 'chattering lory', 'chestnet bellied euphonia', 
    'chinese bamboo partridge', 'chinese pond heron', 'chipping sparrow', 
    'chucao tapaculo', 'chukar partridge', 'cinnamon attila', 'cinnamon flycatcher', 
    'cinnamon teal', 'clarks nutcracker', 'cock of the  rock', 'cockatoo', 
    'collared aracari', 'common firecrest', 'common grackle', 'common house martin', 
    'common iora', 'common loon', 'common poorwill', 'common starling', 
    'coppery tailed coucal', 'crab plover', 'crane hawk', 'cream colored woodpecker', 
    'crested auklet', 'crested caracara', 'crested coua',
    'crested fireback', 'crested kingfisher', 'crested nuthatch', 'crested oropendola',
    'crested shriketit', 'crimson chat', 'crimson sunbird', 'crow', 'crowned pigeon', 
    'cuban tody', 'cuban trogon', 'curl crested aracuri', "d-arnauds barbet", 
    'dalmatian pelican', 'darjeeling woodpecker', 'dark eyed junco', 'darwins flycatcher', 
    'daurian redstart', 'demoiselle crane', 'double barred finch', 'double brested cormarant', 
    'double eyed fig parrot', 'downy woodpecker', 'dusky lory', 'dusky robin', 
    'eared pita', 'eastern bluebird', 'eastern bluebonnet', 'eastern golden weaver', 
    'eastern meadowlark', 'eastern rosella', 'eastern towee', 'eastern wip poor will', 
    'ecuadorian hillstar', 'egyptian goose', 'elegant trogon', 'elliots pheasant', 
    'emerald tanager', 'emperor penguin', 'emu', 'enggano myna', 'eurasian bullfinch', 
    'eurasian golden oriole', 'eurasian magpie', 'european goldfinch', 'european turtle dove', 
    'evening grosbeak', 'fairy bluebird', 'fairy penguin', 'fairy tern', 'fan tailed widow', 
    'fasciated wren', 'fiery minivet', 'fiordland penguin', 'fire tailled myzornis', 
    'flame bowerbird', 'flame tanager', 'frigate', 'gambels quail', 'gang gang cockatoo', 
    'gila woodpecker', 'gilded flicker', 'glossy ibis', 'go away bird', 'gold wing warbler', 
    'golden bower bird', 'golden cheeked warbler', 'golden chlorophonia', 'golden eagle', 
    'golden parakeet', 'golden pheasant', 'golden pipit', 'gouldian finch', 'grandala', 
    'gray catbird', 'gray kingbird', 'gray partridge', 'great gray owl', 'great jacamar', 
    'great kiskadee', 'great potoo', 'great tinamou', 'great xenops', 'greater pewee', 
    'greator sage grouse', 'green broadbill', 'green jay', 'green magpie', 'grey cuckooshrike', 
    'grey plover', 'groved billed ani', 'guinea turaco', 'guineafowl', 'gurneys pitta', 
    'gyrfalcon', 'hamerkop', 'harlequin duck', 'harlequin quail', 'harpy eagle', 
    'hawaiian goose', 'hawfinch', 'helmet vanga', 'hepatic tanager', 'himalayan bluetail', 
    'himalayan monal', 'hoatzin', 'hooded merganser', 'hoopoes', 'horned guan', 
    'horned lark', 'horned sungem', 'house finch', 'house sparrow', 'hyacinth macaw', 
    'iberian magpie', 'ibisbill', 'imperial shaq', 'inca tern', 'indian bustard', 
    'indian pitta', 'indian roller', 'indian vulture', 'indigo bunting', 'indigo flycatcher', 
    'inland dotterel', 'ivory billed aracari', 'ivory gull', 'iwi', 'jabiru', 
    'jack snipe', 'jandaya parakeet', 'japanese robin', 'java sparrow', 'jocotoco antpitta', 
    'kagu', 'kakapo', 'killdear', 'king eider', 'king vulture', 'kiwi', 'kookaburra', 
    'lark bunting', 'lazuli bunting', 'lesser adjutant', 'lilac roller', 'little auk', 
    'loggerhead shrike', 'long-eared owl', 'magpie goose', 'malabar hornbill',
    'malachite kingfisher', 'malagasy white eye', 'maleo', 'mallard duck', 'mandrin duck',
    'mangrove cuckoo', 'marabou stork', 'masked booby', 'masked lapwing', 'mckays bunting', 
    'mikado pheasant', 'mourning dove', 'myna', 'nicobar pigeon', 'noisy friarbird', 
    'northern beardless tyrannulet', 'northern cardinal', 'northern flicker', 'northern fulmar', 
    'northern gannet', 'northern goshawk', 'northern jacana', 'northern mockingbird', 
    'northern parula', 'northern red bishop', 'northern shoveler', 'ocellated turkey', 
    'okinawa rail', 'orange brested bunting', 'oriental bay owl', 'osprey', 'ostrich', 
    'ovenbird', 'oyster catcher', 'painted bunting', 'palila', 'paradise tanager', 
    'paraket akulet', 'parus major', 'patagonian sierra finch', 'peacock', 'peregrine falcon', 
    'philippine eagle', 'pink robin', 'pomarine jaeger', 'puffin', 'purple finch', 'purple gallinule', 
    'purple martin', 'purple swamphen', 'pygmy kingfisher', 'quetzal', 'rainbow lorikeet', 
    'razorbill', 'red bearded bee eater', 'red bellied pitta', 'red browed finch', 
    'red faced cormorant', 'red faced warbler', 'red fody', 'red headed duck', 'red headed woodpecker', 
    'red honey creeper', 'red naped trogon', 'red tailed hawk', 'red tailed thrush', 
    'red winged blackbird', 'red wiskered bulbul', 'regent bowerbird', 'ring-necked pheasant', 
    'roadrunner', 'robin', 'rock dove', 'rosy faced lovebird', 'rough leg buzzard', 
    'royal flycatcher', 'ruby throated hummingbird', 'rudy kingfisher', 'rufous kingfisher', 
    'rufuos motmot', 'samatran thrush', 'sand martin', 'sandhill crane', 'satyr tragopan', 
    'scarlet crowned fruit dove', 'scarlet ibis', 'scarlet macaw', 'scarlet tanager', 'shoebill', 
    'short billed dowitcher', 'skua', 'smiths longspur', 'snowy egret', 'snowy owl', 'snowy plover', 
    'sora', 'spangled cotinga', 'splendid wren', 'spoon biled sandpiper', 'spoonbill', 'spotted catbird', 
    'sri lanka blue magpie', 'steamer duck', 'stork billed kingfisher', 'strawberry finch', 
    'striped owl', 'stripped manakin', 'stripped swallow', 'superb starling', 'swinhoes pheasant', 
    'tailorbird', 'taiwan magpie', 'takahe', 'tasmanian hen', 'teal duck', 'tit mouse', 'touchan', 
    'townsends warbler', 'tree swallow', 'tricolored blackbird', 'tropical kingbird', 
    'trumpter swan', 'turkey vulture', 'turquoise motmot', 'umbrella bird', 'varied thrush', 'veery', 
    'venezuelaian troupial', 'vermillion flycatcher', 'victoria crowned pigeon', 'violet green swallow', 
    'violet turaco', 'vulturine guineafowl', 'wall creeper', 'wattled curassow', 'wattled lapwing', 
    'whimbrel', 'white browed crake', 'white cheeked turaco', 'white crested hornbill', 
    'white necked raven', 'white tailed tropic', 'white throated bee eater', 'wild turkey', 
    'wilsons bird of paradise', 'wood duck', 'yellow bellied flowerpecker', 'yellow cacique', 
    'yellow headed blackbird'
], os.path.join("storage", "positive"), 250)

2024-12-16 20:00:52,225 - INFO - icrawler.crawler - start crawling...
2024-12-16 20:00:52,226 - INFO - icrawler.crawler - starting 1 feeder threads...
2024-12-16 20:00:52,227 - INFO - icrawler.crawler - starting 1 parser threads...
2024-12-16 20:00:52,228 - INFO - icrawler.crawler - starting 1 downloader threads...
2024-12-16 20:00:52,680 - INFO - parser - parsing result page https://www.bing.com/images/async?q=pond forest&first=0
2024-12-16 20:00:53,097 - INFO - downloader - image #1	https://www.wallpaperflare.com/static/888/416/34/riffles-pond-green-forest-landscape-wallpaper-preview.jpg
2024-12-16 20:00:53,719 - INFO - downloader - image #2	https://images6.alphacoders.com/784/784190.jpg
2024-12-16 20:00:54,667 - INFO - downloader - image #3	https://cdn.pixabay.com/photo/2014/03/18/23/15/pond-290313_1280.jpg
2024-12-16 20:00:54,826 - ERROR - downloader - Response status code 403, file https://www.hdwallpapers.in/download/pond_in_forest_5k-5120x2880.jpg
2024-12-16 20:00:55,235 - INFO 