In [1]:
from geoservices_bayern_scraping import building_plans, bounding_boxes

# Scraping Building Plans of Bayern

The [Geoportal of Bayern](https://geoportal.bayern.de/bauleitplanungsportal/karte.html) has an API that provides direct links to building plans across the region. The API takes a bounding box as input, which defines the area over which plans are scraped. This notebook shows how to use two functions written to scrape the entire API: one that generates the bounding boxes needed to cover all of Bayern, and one that downloads the results in batches and saves them as CSV files.

## Step 1: Define the regions

- Define the bounding box for the part of Bayern you want to scrape. This example uses a bounding box that covers the entire region.
- Adjust the parameter `sample_n` to the number of bounding boxes to use for the sample, or remove it to keep all of them. 

In [2]:
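# Bounding box covering all of Bayern
bavaria_bounding_box = (4195669.333333333, 4998144, 4724053.333333333, 5766144)

# Note: this rebinds the name `bounding_boxes` from the imported module to the
# function's result, so re-import the module before running this cell again.
bounding_boxes = bounding_boxes.generate_sub_bboxes(sample_n=300,
                                                    bounding_box=bavaria_bounding_box)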
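
To illustrate the idea behind this step, here is a minimal sketch of one way a large bounding box could be tiled into smaller ones and optionally sampled. It is not the implementation of `generate_sub_bboxes`: the helper name `split_bbox`, its `n_cols`, `n_rows` and `seed` parameters, and the assumed coordinate order `(min_x, min_y, max_x, max_y)` are all hypothetical.

```python
import random


def split_bbox(bounding_box, n_cols, n_rows, sample_n=None, seed=0):
    """Tile a bounding box into n_cols * n_rows equal cells.

    Hypothetical helper for illustration only. Assumes the box is given
    as (min_x, min_y, max_x, max_y).
    """
    min_x, min_y, max_x, max_y = bounding_box
    step_x = (max_x - min_x) / n_cols
    step_y = (max_y - min_y) / n_rows

    cells = [
        (min_x + i * step_x, min_y + j * step_y,
         min_x + (i + 1) * step_x, min_y + (j + 1) * step_y)
        for i in range(n_cols)
        for j in range(n_rows)
    ]

    # Optionally keep only a reproducible random sample of the cells.
    if sample_n is not None and sample_n < len(cells):
        cells = random.Random(seed).sample(cells, sample_n)
    return cells


bavaria = (4195669.333333333, 4998144, 4724053.333333333, 5766144)
cells = split_bbox(bavaria, n_cols=4, n_rows=4, sample_n=5)
print(len(cells))
```

Sampling a subset is useful for a quick trial run; leaving `sample_n` unset in this sketch returns every cell of the grid.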

## Step 2: Run the scraper

- Adjust `batch_size` to set the number of items run in each batch.
- Adjust `batch_delay` to lengthen or shorten the pause between batches. A pause reduces the load on the server and helps avoid IP blocking or other restrictions triggered by rapid sequential requests.
- Adjust `max_retries` to set the number of times each download is attempted if it fails.
- Adjust `output_folder` to change the name of the output folder.

In [4]:
building_plans.scrape_in_batches(bounding_boxes,
                                 batch_size=100,
                                 batch_delay=30,
                                 max_retries=5,
                                 output_folder='geoservices_results_sample')

100%|██████████| 100/100 [19:24<00:00, 11.65s/it]
100%|██████████| 100/100 [19:39<00:00, 11.79s/it]
100%|██████████| 100/100 [19:24<00:00, 11.65s/it]
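
The batching behaviour controlled by `batch_size`, `batch_delay` and `max_retries` can be sketched roughly as follows. This is an assumption about the general pattern, not the code of `scrape_in_batches`: the function `run_in_batches`, its `worker` callback and the injectable `sleep` argument are hypothetical names used only for illustration.

```python
import time


def run_in_batches(items, worker, batch_size=100, batch_delay=30,
                   max_retries=5, sleep=time.sleep):
    """Process items in batches, pausing between batches and retrying failures.

    Hypothetical sketch: `worker` is any callable that handles one item
    and raises an exception when it fails.
    """
    results, failed = [], []
    for start in range(0, len(items), batch_size):
        if start > 0:
            sleep(batch_delay)  # pause before every batch except the first
        for item in items[start:start + batch_size]:
            for _attempt in range(max_retries):
                try:
                    results.append(worker(item))
                    break  # success, stop retrying this item
                except Exception:
                    continue  # try again until attempts run out
            else:
                failed.append(item)  # every attempt failed
    return results, failed


# Usage with a trivial worker and no delay
results, failed = run_in_batches([1, 2, 3], lambda x: x * 10,
                                 batch_size=2, batch_delay=0)
print(results, failed)
```

Collecting failed items in a separate list, rather than raising on the first error, lets a long run finish and makes it easy to retry only the failures afterwards.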
