# Example using SpeciesDownloader

In [6]:
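# To run this notebook:
# 1) Install python3 and git
# 2) Open a command line:
#    a) Windows: cmd
#    b) MacOS/Linux: terminal
# 3) git clone the species_img_downloader repository
# 4) cd species_img_downloader
# 5) Create and activate a virtual environment:
#    a) Windows: python -m venv venv && venv\Scripts\activate
#    b) MacOS/Linux: python3 -m venv venv && source venv/bin/activate
# 6) pip install -r requirements.txt
# 7) jupyter notebook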

### In this example, we want to prepare a dataset for training a bird species classifier that will be used in cameras in Saint Petersburg, Russia. To do that, we need to:
* Prepare a list of the names of bird species that live in Saint Petersburg.
* Collect enough bird photos for each species.

### Prepare species list.

It would be difficult to compile the list of all species in Saint Petersburg manually, so we will scrape it from avibase.bsc-eoc.org.

In [1]:
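from bs4 import BeautifulSoup
import requests

# Avibase checklist page for the Saint Petersburg region
req = requests.get('https://avibase.bsc-eoc.org/checklist.jsp?region=RUnwsp&list=howardmoore', timeout=30)
req.raise_for_status()  # fail early on an HTTP error instead of parsing an error page
soup = BeautifulSoup(req.content, 'html.parser')

# Collect the text of every italic (<i>) element, skipping empty ones
species_list = []
for row in soup.find_all('i'):
    text = row.text.strip()
    if text:
        species_list.append(text)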
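
The checklist URL in the cell above carries the region and the taxonomy as query parameters (`region=RUnwsp`, `list=howardmoore`). A minimal sketch of building that URL from its parts is shown here. The helper name `build_checklist_url` is hypothetical, and it is only an assumption that other region codes follow the same URL scheme.

```python
from urllib.parse import urlencode

# Endpoint copied from the URL used in the scraping cell
AVIBASE_CHECKLIST = 'https://avibase.bsc-eoc.org/checklist.jsp'


def build_checklist_url(region, taxonomy='howardmoore'):
    """Hypothetical helper: assemble a checklist URL from a region code.

    Assumes the `region` and `list` query parameters behave for other
    regions the same way they do for 'RUnwsp'.
    """
    query = urlencode({'region': region, 'list': taxonomy})
    return f'{AVIBASE_CHECKLIST}?{query}'


# Reproduces the URL used in this notebook
print(build_checklist_url('RUnwsp'))
```

The result can be passed to `requests.get` in place of the hard-coded string.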

In [5]:
print(f'Total species: {len(species_list)}')
print(f'First 10: {species_list[:10]}')

Total species: 257
First 10: ['Oxyura leucocephala', 'Cygnus olor', 'Cygnus columbianus', 'Cygnus cygnus', 'Branta bernicla', 'Branta leucopsis', 'Branta canadensis', 'Anser anser', 'Anser fabalis', 'Anser albifrons']
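
Because the scraper collects every italic element on the page, the list may contain duplicates or text that is not a species name. The following is a minimal sketch of a sanity filter, assuming that valid entries are two-word Latin binomials such as `Cygnus olor`. The helper `clean_species_names` and the regular expression are illustrative and not part of `species_img_downloader`.

```python
import re

# Assumption: a species name is "Genus epithet", e.g. "Cygnus olor"
BINOMIAL = re.compile(r'^[A-Z][a-z]+ [a-z-]+$')


def clean_species_names(names):
    """Hypothetical helper: keep unique binomial names, preserving order."""
    seen = set()
    cleaned = []
    for name in names:
        # Collapse repeated whitespace and trim the ends
        name = ' '.join(name.split())
        if BINOMIAL.match(name) and name not in seen:
            seen.add(name)
            cleaned.append(name)
    return cleaned


sample = ['Cygnus olor', ' Cygnus  olor ', 'Anser anser', 'see notes', '']
print(clean_species_names(sample))  # ['Cygnus olor', 'Anser anser']
```

Running `species_list = clean_species_names(species_list)` before downloading would apply the filter to the scraped list. Note that the pattern also drops three-word subspecies names, which may or may not be desired.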


Now we have the list of species names. Let's download photos for each species. For this example we take only 2 species and 10 photos per species, but you can request 100, 1000, or even more photos.

In [7]:
from species_img_downloader import SpeciesDownloader

download_path = './data/'  # directory where the images are saved
img_count = 10             # maximum number of images per species
downloader = SpeciesDownloader(species_names=species_list[:2])
downloader.download(save_dir=download_path, limit_img_per_species=img_count)

2021-08-18 16:20:55,965 INFO Start getting species images urls from INAT API
  5%|█         | 1/20 [00:06<02:11,  6.94s/it]
2021-08-18 16:21:02,924 INFO 11 URLS found for Oxyura leucocephala
 60%|██████    | 12/20 [00:11<00:06,  1.21it/s]
2021-08-18 16:21:07,752 INFO 11 URLS found for Cygnus olor
22it [00:11,  1.87it/s]
2021-08-18 16:21:07,753 INFO Start preparing save paths
2021-08-