### Object-oriented programming project

---

To wrap up, we will walk through a practical project.

<br>

The project has these parts:
1. [Data collection, OOP,](#Data-collection,-OOP)
2. [Data analysis, pandas,]()
3. [Data storage, sqlite3,]()
4. [Data interpretation, flask.]()

<br>

#### Data collection, OOP

---

Set up the virtual environment:
```
$ python -m venv projekt04       # create the virtual environment
$ source projekt04/bin/activate  # activate the virtual environment
```
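
To confirm from Python that the activation took effect, one option is to compare `sys.prefix` with `sys.base_prefix`. This is a minimal sketch; the helper name `in_virtualenv` is an assumption made for illustration and is not part of the project.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter:", sys.executable)
```

If the first line prints `False`, the `source` command was most likely run in a different shell session.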

<br>

Install the required libraries:
```
$ pip --version  # check the package manager
$ pip install -r requirements.txt

Collecting beautifulsoup4==4.10.0
  Using cached beautifulsoup4-4.10.0-py3-none-any.whl (97 kB)
Collecting certifi==2021.10.8
...
```

<br>

In the project root, create a documentation file `README.md`:
```
#### Project from lesson four

---

This project reviews concepts from the Engeto course, an introduction to OOP.

<br>

#### Installation

---

Create a virtual environment and install the required libraries with the package manager:

<br>

#### Running

---

To start the project, use the command:

<br>

#### Notes

---

```

<br>

Create a directory for the new package:
```
/projekt04
    ├─requirements.txt
    ├─README.md
    └─muj_balicek
       ├─__init__.py
       ├─collector.py
       ├─processor.py
       └─db.py
```
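
The layout above can also be created from a script. The following is a minimal sketch using `pathlib`; the function name `create_skeleton` and the `PROJECT_FILES` constant are assumptions made for illustration, and the files are created empty.

```python
import tempfile
from pathlib import Path
from typing import List

# File layout copied from the tree above.
PROJECT_FILES = (
    "requirements.txt",
    "README.md",
    "muj_balicek/__init__.py",
    "muj_balicek/collector.py",
    "muj_balicek/processor.py",
    "muj_balicek/db.py",
)


def create_skeleton(root: Path) -> List[Path]:
    """Create empty placeholder files under `root`, keeping existing ones."""
    created = []
    for relative in PROJECT_FILES:
        path = root / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch(exist_ok=True)  # leaves the content of existing files alone
        created.append(path)
    return created


if __name__ == "__main__":
    # Demonstrate in a throwaway directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        for path in create_skeleton(Path(tmp) / "projekt04"):
            print(path.relative_to(tmp))
```

For the real project, the call would be `create_skeleton(Path("projekt04"))` from the directory that contains `projekt04`.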

#### Importing libraries

---

In [4]:
"""
# preview of one record
# --------------
example: Dict[str, Union[str, float, int]] = 
{'gps': '{"lat":49.23727,"lng":16.58296}',
 'price': 6499000,
 'currency': 'CZK',
 'key_offer_type': 'prodej',
 'key_estate_type': 'byt',
 'key_disposition': '2-1',
 'surface': 60,
 'surface_land': 0}
"""

import json
from urllib.parse import urljoin
from typing import Dict, List, Union

import bs4
import requests

#### Creating an object for our data

---

In [12]:
class BRRealEstateOffer:
    """Create a new object from the given attributes."""
    offer_count: int = 0
    
    def __init__(
        self, id_: int, url: str,
        details: Dict[str, Union[str, int, float]]
    ):
        self.id_ = id_
        self.url = url
        self.details = details

    @classmethod
    def add_offer(cls):
        cls.offer_count += 1
        
    def __repr__(self) -> str:
        return self.url
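
The class above defines `offer_count` and `add_offer`, but nothing calls `add_offer`, so the counter stays at 0. The sketch below shows one way the counter could be wired in, assuming the intent is to count created instances. `CountedOffer` is a hypothetical variant used only for illustration, and the URLs are placeholders.

```python
from typing import Dict, Union


class CountedOffer:
    """Hypothetical variant of BRRealEstateOffer that counts its instances."""

    offer_count: int = 0  # shared by all instances

    def __init__(self, id_: int, url: str,
                 details: Dict[str, Union[str, int, float]]):
        self.id_ = id_
        self.url = url
        self.details = details
        # Assumption: every created offer should bump the shared counter.
        self.add_offer()

    @classmethod
    def add_offer(cls) -> None:
        cls.offer_count += 1

    def __repr__(self) -> str:
        return f"{type(self).__name__}(id_={self.id_!r}, url={self.url!r})"


first = CountedOffer(1, "https://example.com/1", {"price": 100})
second = CountedOffer(2, "https://example.com/2", {"price": 200})
print(CountedOffer.offer_count)  # 2
print(first)
```

Because `add_offer` is a class method, the increment lands on the class attribute, so every instance sees the same total.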

In [13]:
example: Dict[str, Union[str, float, int]] = {
    'id': 695305,
    'url': "https://www.bezrealitky.cz/nemovitosti-byty-domy/695305-nabidka-prodej-bytu-ostruzinova-brno",
    'gps': '{"lat":49.23727,"lng":16.58296}',
    'price': 6499000,
    'currency': 'CZK',
    'key_offer_type': 'prodej',
    'key_estate_type': 'byt',
    'key_disposition': '2-1',
    'surface': 60,
    'surface_land': 0
}

offer_1 = BRRealEstateOffer(
    id_=example.pop('id'),
    url=example.pop('url'),
    details=example,
)

In [14]:
offer_1.__dict__

{'id_': 695305,
 'url': 'https://www.bezrealitky.cz/nemovitosti-byty-domy/695305-nabidka-prodej-bytu-ostruzinova-brno',
 'details': {'gps': '{"lat":49.23727,"lng":16.58296}',
  'price': 6499000,
  'currency': 'CZK',
  'key_offer_type': 'prodej',
  'key_estate_type': 'byt',
  'key_disposition': '2-1',
  'surface': 60,
  'surface_land': 0}}

In [20]:
class ScraperInitiator:
    """Initiate a new object for web-scraping."""
    
    def __init__(self, url: str, params: Dict[str, str]):
        self.url = url
        self.params = params
        
    def send_post_request(self) -> requests.models.Response:
        return requests.post(self.url, params=self.params)
    
    @staticmethod
    def load_json(response: requests.models.Response) -> List[Dict[str, str]]:
        """Parse the JSON text of the response body."""
        return json.loads(response.text)
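
`load_json` assumes the body is always a valid JSON array. A more defensive variant might look like the sketch below; `parse_markers` is a hypothetical helper, not part of the notebook, and the error messages are illustrative.

```python
import json
from typing import Any, Dict, List


def parse_markers(text: str) -> List[Dict[str, Any]]:
    """Decode a JSON response body and check that it is a list of objects."""
    try:
        payload = json.loads(text)
    except json.JSONDecodeError as error:
        raise ValueError(f"response is not valid JSON: {error}") from error

    if not isinstance(payload, list):
        raise ValueError(f"expected a JSON array, got {type(payload).__name__}")
    if not all(isinstance(item, dict) for item in payload):
        raise ValueError("expected every array item to be a JSON object")
    return payload


sample = '[{"id": "695544", "uri": "695544-nabidka-prodej-bytu-skolni-cheb"}]'
print(parse_markers(sample)[0]["id"])  # 695544
```

With a real response, it could be called as `parse_markers(response.text)`, ideally after `response.raise_for_status()` so that HTTP errors surface before parsing.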

In [21]:
scraper_1 = ScraperInitiator(
    "https://www.bezrealitky.cz/api/record/markers",
    {
        'offerType': 'prodej',
        'submit': '1',
        'boundary': '[[[{"lat":52,"lng":12},{"lat":52,"lng":16},{"lat":50,"lng":16},{"lat":50,"lng":12},{"lat":52,"lng":12}]]]'
    }
)

In [22]:
json_file = scraper_1.load_json(
    scraper_1.send_post_request()
)

In [23]:
type(json_file)

list

In [24]:
json_file[0]

{'id': '695544',
 'uri': '695544-nabidka-prodej-bytu-skolni-cheb',
 'keyAdvertType': 'estate_offer',
 'type': '',
 'timeOrder': {'date': '2021-12-08 11:00:04.000000',
  'timezone_type': 3,
  'timezone': 'Europe/Berlin'},
 'orderPriority': 0,
 'advertEstateOffer': [{'gps': '{"lat":50.07845,"lng":12.37209}',
   'price': 2500000,
   'currency': 'CZK',
   'keyOfferType': 'prodej',
   'keyEstateType': 'byt',
   'keyDisposition': '2-1',
   'surface': 55,
   'surfaceLand': 0,
   'id': '695544'}]}

In [85]:

# Uncomment to enable find_complete_address() below (requires geopy)
# from geopy.geocoders import Nominatim

# Fetch the data from the API
url: str = "https://www.bezrealitky.cz/api/record/markers"
params: Dict[str, str] = {
    'offerType': 'prodej',
    'submit': '1',
    'boundary': '[[[{"lat":52,"lng":12},{"lat":52,"lng":16},{"lat":50,"lng":16},{"lat":50,"lng":12},{"lat":52,"lng":12}]]]'
}
    
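# Send the request; the second positional argument of requests.post() is `data`,
# so the parameters travel as a form-encoded body
response: requests.models.Response = requests.post(url, data=params)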

In [86]:
# Check the data by parsing the JSON body
result: List[Dict[str, str]] = json.loads(response.text)

In [87]:
# Parse the data


def iterate_through_data(data: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Flatten every raw record into a single dictionary."""
    # The trailing slash makes urljoin append the uri to this path
    base_url = "https://www.bezrealitky.cz/nemovitosti-byty-domy/"
    details = []

    for dict_ in data:
        id_, uri, attrs = parse_dict_object(dict_)
        values: Dict[str, str] = {"id": id_, "url": urljoin(base_url, uri)}
        values.update(parse_nested_dict(attrs))
        # values["complete_address"] = find_complete_address(values.get("gps"))
        details.append(values)

    return details


def parse_dict_object(auc: Dict[str, str]):
    """Parse the attributes from the given dictionary."""
    id_ = auc.get("id")
    uri = auc.get("uri")
    details = auc.get("advertEstateOffer")[0]
    
    return id_, uri, details


def parse_nested_dict(details: Dict[str, Union[int, str]]):
    """Inside the nested dictionary, get the keys and their values."""

    return {
        "gps": details.get("gps"),
        "price": details.get("price"),
        "currency": details.get("currency"),
        "key_offer_type": details.get("keyOfferType"),
        "key_estate_type": details.get("keyEstateType"),
        "key_disposition": details.get("keyDisposition"),
        "surface": details.get("surface"),
        "surface_land": details.get("surfaceLand"),
    }


def find_complete_address(key: str) -> str:
    """Find the address with the gps coordinates."""
    lat, lng = json.loads(key).values()
    join_coords = ", ".join((str(lat), str(lng)))

    geolocator = Nominatim(
        user_agent="specify_your_app_name_here"
    )

    return geolocator.reverse(join_coords)
    

results = iterate_through_data(result)
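
`parse_nested_dict` renames each camelCase key by hand. As an alternative, the renaming can be derived from the key itself. The sketch below assumes the keys follow plain camelCase as in the sample record; `camel_to_snake` and `snake_case_keys` are hypothetical helpers.

```python
import re
from typing import Any, Dict

# Zero-width position between a lowercase letter or digit and an uppercase letter.
_CAMEL_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])")


def camel_to_snake(name: str) -> str:
    """Convert e.g. 'keyOfferType' to 'key_offer_type'."""
    return _CAMEL_BOUNDARY.sub("_", name).lower()


def snake_case_keys(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `record` with every key converted to snake_case."""
    return {camel_to_snake(key): value for key, value in record.items()}


raw = {
    "gps": '{"lat":50.07845,"lng":12.37209}',
    "price": 2500000,
    "keyOfferType": "prodej",
    "surfaceLand": 0,
}
print(snake_case_keys(raw))
```

Unlike the hand-written mapping, this keeps every key it is given, including the nested `id`, so unwanted keys would have to be dropped separately.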

In [88]:
len(results)
# results[10: 15]

898
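
The flattened records can be turned into `BRRealEstateOffer` objects, which links the parsing step back to the class defined earlier. This is a sketch under two assumptions: `build_offers` is a hypothetical helper, and the string ids returned by the API are converted to `int`. The class is repeated in trimmed form only to keep the example runnable on its own.

```python
from typing import Any, Dict, List


class BRRealEstateOffer:
    """Trimmed copy of the class defined earlier."""

    def __init__(self, id_: int, url: str, details: Dict[str, Any]):
        self.id_ = id_
        self.url = url
        self.details = details

    def __repr__(self) -> str:
        return self.url


def build_offers(records: List[Dict[str, Any]]) -> List[BRRealEstateOffer]:
    """Turn flattened records into offer objects without mutating the input."""
    offers = []
    for record in records:
        details = dict(record)  # shallow copy, so pop() leaves `record` intact
        offers.append(
            BRRealEstateOffer(
                id_=int(details.pop("id")),  # assumption: ids arrive as strings
                url=details.pop("url"),
                details=details,
            )
        )
    return offers


sample_records = [
    {
        "id": "695544",
        "url": "https://www.bezrealitky.cz/nemovitosti-byty-domy/"
               "695544-nabidka-prodej-bytu-skolni-cheb",
        "price": 2500000,
        "currency": "CZK",
    }
]
offers = build_offers(sample_records)
print(offers[0].id_, offers[0].details)
```

With the real data, the call would be `build_offers(results)`.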

In [36]:
uri

'695305-nabidka-prodej-bytu-ostruzinova-brno'

In [37]:
details

{'gps': '{"lat":49.23727,"lng":16.58296}',
 'price': 6499000,
 'currency': 'CZK',
 'key_offer_type': 'prodej',
 'key_estate_type': 'byt',
 'key_disposition': '2-1',
 'surface': 60,
 'surface_land': 0}

In [71]:
from geopy.geocoders import Nominatim

lat, lng = json.loads(values.get("gps")).values()
join_coords = ", ".join((str(lat), str(lng)))
geolocator = Nominatim(user_agent="specify_your_app_name_here")
location = geolocator.reverse(join_coords)
values["complete_address"] = location
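
The cell above unpacks `lat` and `lng` from `.values()`, which relies on the order of keys in the `gps` string. Reading the coordinates by key avoids that dependency. A minimal sketch follows; `parse_gps` and `to_query` are hypothetical helpers, and no geocoding request is made here.

```python
import json
from typing import Tuple


def parse_gps(gps: str) -> Tuple[float, float]:
    """Read latitude and longitude by key from the JSON-encoded 'gps' string."""
    coords = json.loads(gps)
    return float(coords["lat"]), float(coords["lng"])


def to_query(gps: str) -> str:
    """Format the coordinates as the 'lat, lng' string used above."""
    lat, lng = parse_gps(gps)
    return f"{lat}, {lng}"


print(to_query('{"lat":49.23727,"lng":16.58296}'))  # 49.23727, 16.58296
```

The resulting string can be passed to `geolocator.reverse(...)` in place of `join_coords`.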

---