# Flightradar analysis

In this task we will use the https://www.flightradar24.com/ website to monitor airplanes near Innopolis. To do this, we will use information from the page URL and from the moving airplane icons on the map. Let's have fun with airplanes!

**NB:** This lab is designed to be executed **locally** on your laptop, because it launches a local application (a browser). Headless mode can be used in Colab, but that would also require specific browser installation steps. Thus, please use Anaconda.

## Dependency installation

Let's try to load and parse the page the way we did before:

In [1]:
import requests
from bs4 import BeautifulSoup
resp = requests.get("https://www.flightradar24.com/")
print("Status:", resp.status_code)

Status: 451


Wowowow! https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/451 :

```
The HyperText Transfer Protocol (HTTP) '451 Unavailable For Legal Reasons' client error response code indicates that the user requested a resource that is not available due to legal reasons, such as a web page for which a legal action has been issued.
```

As we see, the output is not what we would expect. So, what can we do when a page is not served ready-made, but is instead rendered by a script, and only in a valid browser? Browser engines can help us get the data. Let's try to load the same web page in a different way: we will give a browser some time to load the scripts and run them. Then we will work with the DOM (Document Object Model), but we will obtain this DOM from the browser engine itself, not via `BeautifulSoup`.

Where do we get a browser engine from? Simply installing a browser will do. How do we send commands to it from code and retrieve the DOM? Service applications called `drivers` interpret commands and translate them into browser actions.

For each supported browser engine you will need to:
1. install the browser itself;
2. download the 'driver', a binary executable which passes commands from selenium to the browser. E.g. [GeckoDriver for Firefox](https://github.com/mozilla/geckodriver/releases), [ChromeDriver](http://chromedriver.storage.googleapis.com/index.html);
3. unpack the driver into a folder listed in the PATH environment variable, or specify the exact binary location when you write the code.
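
Before launching a browser, it can help to check whether a driver binary is actually discoverable on PATH. The following is a minimal sketch using only the standard library; the helper name `find_driver` and the list of candidate names are illustrative assumptions, not part of selenium.

```python
import shutil


def find_driver(candidates=("chromedriver", "geckodriver")):
    """Return {name: path or None} for each driver binary looked up on PATH."""
    # shutil.which performs a PATH lookup and returns None when nothing is found
    return {name: shutil.which(name) for name in candidates}


for name, path in find_driver().items():
    print(f"{name}: {path or 'not found on PATH'}")
```

If a driver is reported as not found, either add its folder to PATH or pass the full path explicitly when creating the browser object.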

### Download driver

Download the driver and place it in any folder, preferably one listed in the PATH environment variable: [Firefox](https://github.com/mozilla/geckodriver/releases), [Chrome](http://chromedriver.storage.googleapis.com/index.html)

### Install selenium

Selenium is a powerful tool for automated UI testing. We will use it to emulate user actions on the website.

In [4]:
!pip install -U selenium

Collecting selenium
  Using cached selenium-4.8.0-py3-none-any.whl (6.3 MB)
Collecting trio-websocket~=0.9
  Using cached trio_websocket-0.9.2-py3-none-any.whl (16 kB)
Collecting trio~=0.17
  Using cached trio-0.22.0-py3-none-any.whl (384 kB)
Collecting exceptiongroup>=1.0.0rc9
  Using cached exceptiongroup-1.1.0-py3-none-any.whl (14 kB)
Collecting outcome
  Using cached outcome-1.2.0-py2.py3-none-any.whl (9.7 kB)
Collecting async-generator>=1.9
  Using cached async_generator-1.10-py3-none-any.whl (18 kB)
Collecting wsproto>=0.14
  Using cached wsproto-1.2.0-py3-none-any.whl (24 kB)
Collecting h11<1,>=0.9.0
  Using cached h11-0.14.0-py3-none-any.whl (58 kB)
Installing collected packages: outcome, h11, exceptiongroup, async-generator, wsproto, trio, trio-websocket, selenium
Successfully installed async-generator-1.10 exceptiongroup-1.1.0 h11-0.14.0 outcome-1.2.0 selenium-4.8.0 trio-0.22.0 trio-websocket-0.9.2 wsproto-1.2.0


Check it works

In [16]:
from selenium import webdriver

### Launch browser

This will open a browser window

In [43]:
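from selenium.webdriver.chrome.service import Service

# browser = webdriver.Firefox()
# or explicitly (selenium 4 style: driver path via Service, browser binary via Options)
# from selenium.webdriver.firefox.options import Options as FirefoxOptions
# from selenium.webdriver.firefox.service import Service as FirefoxService
# firefox_options = FirefoxOptions()
# firefox_options.binary_location = 'C:/Program Files/Mozilla Firefox/firefox.exe'
# browser = webdriver.Firefox(service=FirefoxService('C:/bin/geckodriver.exe'), options=firefox_options)

# executable_path= is deprecated in selenium 4; pass the driver path through a Service object
browser = webdriver.Chrome(service=Service("/home/iviosab/Downloads/drivers/chromedriver"))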



### Download the page ... again

In [44]:
from selenium.webdriver.common.by import By

# navigate to page
browser.get('https://www.flightradar24.com/')
browser.implicitly_wait(10)  # element lookups will wait up to 10 seconds for a match

# select all clickable map icons (airplanes and airports) from the document
elements = browser.find_elements(By.CSS_SELECTOR, "div[role=button]")
# note: if this number differs from launch to launch, the page needs more time to load
print("Elements found:", len(elements))

Elements found: 43
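
Because the number of icons can differ between launches, one option is to poll until the count stops changing instead of guessing a fixed wait time. This is a sketch only: the helper `wait_for_stable_count` and its parameters are hypothetical and not part of selenium. It accepts any zero-argument callable, so it can be tried without a browser.

```python
import time


def wait_for_stable_count(count_fn, checks=3, interval=1.0, timeout=30.0):
    """Poll count_fn() until it returns the same value `checks` times in a row.

    Returns the last observed count, whether or not it stabilized before timeout.
    """
    deadline = time.monotonic() + timeout
    last = count_fn()
    streak = 1  # how many consecutive polls returned `last`
    while streak < checks and time.monotonic() < deadline:
        time.sleep(interval)
        current = count_fn()
        streak = streak + 1 if current == last else 1
        last = current
    return last


# Assumed usage with the browser object from this notebook:
# n = wait_for_stable_count(
#     lambda: len(browser.find_elements(By.CSS_SELECTOR, "div[role=button]")))
```

Airplanes keep entering and leaving the map, so the count may never be perfectly stable; the `timeout` argument keeps the loop from running forever.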


### Preparatory functions

We will center our map around Innopolis, and choose one of the suitable scales.

In [45]:
innopolis = "55.75,48.75"
scale = 9


def scale_km_per_px(scale):
    return 2 ** 8 / 3 / (2 ** scale)


def dist(a, b):
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** .5
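
To see how these two helpers combine, here is a small worked example. The two functions are repeated so that the snippet runs on its own, and the pixel coordinates are made-up values chosen for easy arithmetic, not real measurements.

```python
def scale_km_per_px(scale):
    # same formula as above: each extra zoom level halves the kilometres per pixel
    return 2 ** 8 / 3 / (2 ** scale)


def dist(a, b):
    # Euclidean distance between two (x, y) pixel points
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** .5


center = (432, 431)  # hypothetical screen center in pixels
icon = (732, 831)    # hypothetical icon position: 300 px right, 400 px down

pixels = dist(center, icon)  # a 3-4-5 triangle scaled by 100
for scale in (8, 9):
    km = pixels * scale_km_per_px(scale)
    print(f"scale {scale}: {scale_km_per_px(scale):.4f} km/px, {km:.2f} km")
```

According to the formula, one pixel corresponds to 1/3 km at scale 8 and to 1/6 km at scale 9, so the same pixel offset means half the distance after zooming in by one level.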

## Solving the problem

### Obtain center coordinates

The first task is to get the pixel coordinates of the screen center. You are given a browser instance object, and we want to know the size of the rendered page (NB: not the window!). To do this:
1. Find the root `html` tag by tag name. Refer to the [`find_element` documentation](https://selenium-python.readthedocs.io/locating-elements.html) and [`By` options](https://www.selenium.dev/selenium/docs/api/py/webdriver/selenium.webdriver.common.by.html).
2. Extract the **attribute** values of this tag. We are interested in `clientWidth` and `clientHeight`. [See this doc](https://selenium-python.readthedocs.io/api.html#selenium.webdriver.remote.webelement.WebElement.get_attribute) for usage.
3. Divide these values by 2 and return them as a tuple.

In [46]:
def get_center_point(browser):
    html = browser.find_element(By.TAG_NAME, "html")
    inner_width = int(html.get_attribute("clientWidth"))
    inner_height = int(html.get_attribute("clientHeight"))
    # in center
    innopolis_px = (inner_width / 2, inner_height / 2)
    return innopolis_px

print(get_center_point(browser))


(432.0, 431.0)


### Catching the airplane

This code will search for airplane and airport icons and their coordinates on the map. Your task is to complete the check that decides whether an icon is an airport or an airplane.

Airport example:

```
<div style="width: 20px; height: 20px; overflow: hidden; position: absolute; cursor: pointer; touch-action: none; left: 131px; top: -89px; z-index: 1090430;" title="Yoshkar-Ola Airport (JOK/UWKJ)" aria-label="Yoshkar-Ola Airport (JOK/UWKJ)" role="button" tabindex="0">...</div>
```

Airplane example:

```
<div style="width: 33px; height: 33px; overflow: hidden; position: absolute; cursor: pointer; touch-action: none; left: -30px; top: 17px; z-index: 1031004;" title="" role="button" tabindex="-1">...</div>
```

Again, I think the [get_attribute(...) call](https://selenium-python.readthedocs.io/api.html#selenium.webdriver.remote.webelement.WebElement.get_attribute) can help to distinguish these.

In [48]:
def spot_some_air_stuff(browser):
    # these are all the elements, corresponding to the desired filter
    elements = browser.find_elements(By.CSS_SELECTOR, "div[role=button][tabindex='-1']")
    airports = []
    airplanes = []
    
    for element in elements:
        aria = element.get_attribute("aria-label")
        if aria:
            airports.append(element)
        else:
            airplanes.append(element)
    return airports, airplanes

print(spot_some_air_stuff(browser))

([<selenium.webdriver.remote.webelement.WebElement (session="54d294bb9734bd5d77fc4ad4ac762ae1", element="3f23da0a-c094-424a-85bc-70195ae404ff")>, <selenium.webdriver.remote.webelement.WebElement (session="54d294bb9734bd5d77fc4ad4ac762ae1", element="db7f291d-653d-4e3e-8dde-48cc3a2af6ba")>, <selenium.webdriver.remote.webelement.WebElement (session="54d294bb9734bd5d77fc4ad4ac762ae1", element="9dd7686d-2d9f-4f0e-9d59-aebf54085b6d")>, <selenium.webdriver.remote.webelement.WebElement (session="54d294bb9734bd5d77fc4ad4ac762ae1", element="356db202-fd83-4501-ad01-ec552ab10f83")>, <selenium.webdriver.remote.webelement.WebElement (session="54d294bb9734bd5d77fc4ad4ac762ae1", element="ec35a3e5-f09e-41d1-a8fb-6353eb3d6ba7")>, <selenium.webdriver.remote.webelement.WebElement (session="54d294bb9734bd5d77fc4ad4ac762ae1", element="c9d01d3d-9cde-4ffe-b886-f11438cb2b33")>, <selenium.webdriver.remote.webelement.WebElement (session="54d294bb9734bd5d77fc4ad4ac762ae1", element="085c6d96-18d4-420b-8333-22c3d04

### Get the info from the pane

When we click on the airplane image, a side pane appears. We will read the info from this pane.

In [49]:
def get_flight_info(browser):
    flight = browser.find_element(By.CSS_SELECTOR, 'h2.airline-info__flight-no')
    dep = browser.find_element(By.CSS_SELECTOR, "a.dep.iata")
    dest = browser.find_element(By.CSS_SELECTOR, "a.arr.iata")
    flight_number = flight.text
    departure = dep.get_attribute('data-tooltip-value')
    destination = dest.get_attribute('data-tooltip-value')   
    return flight_number, departure, destination

### And here is the main method

Add the missing lines of code where a TODO is specified.

In [50]:
def report_flights(browser, center, scale):
    import time

    browser.get(f"https://www.flightradar24.com/{center}/{scale}")
    # wait for the page to load
    browser.implicitly_wait(5)
    # wait for dynamic elements to load
    time.sleep(5)
    innopolis_px = get_center_point(browser)   
    airports, airplanes = spot_some_air_stuff(browser)
    for element in airports:
        # shifts are due to airport figure size
        coord = (element.location['x'] + element.size['width'] // 2, 
                 element.location['y'] + element.size['height'] // 2)
        d = dist(innopolis_px, coord) * scale_km_per_px(scale)
        print(f"Airport {element.get_attribute('aria-label')} is {d:.2f} km away.")
    
    for element in airplanes:
        try:
            # TODO click on the airplane icon (element). See https://selenium-python.readthedocs.io/api.html#selenium.webdriver.remote.webelement.WebElement.click
            # your code here
            element.click()
            
            # let it render the pane
            time.sleep(1)
            # extract flight info from the pane
            flight_number, departure, destination = get_flight_info(browser)
            # shifts are due to airplane figure size
            coord = (element.location['x'] + element.size['width'] // 2, 
                     element.location['y'] + element.size['height'] // 2)
            d = dist(innopolis_px, coord) * scale_km_per_px(scale)
            message = (f"{flight_number} flies\n\tfrom " + 
                       f"{departure}\n\tto " + 
                       f"{destination}\n\t" + 
                       f"{d:.2f}km far away from Innopolis")
            message = message.replace("<br>", " ")
            print(message)

            # TODO: click on the [x] in the corner of the panel.
            # this is an <a> tag with 'close-panel' class
            # NB: sometimes this can also raise an exception due to occlusion
            # your code here
            close = browser.find_element(By.CSS_SELECTOR, "a.close-panel")
            close.click()
        except Exception as e:
            # skip icons that cannot be clicked or whose pane cannot be read
            pass
            # print(e)

In [52]:
%%time
report_flights(browser, innopolis, 8)

Airport Ulyanovsk Vostochny Airport (ULY/UWLW) is 139.84 km away.
Airport Cheboksary Airport (CSY/UWKS) is 91.08 km away.
Airport Kazan International Airport (KZN/UWKD) is 37.95 km away.
UT363 flies
	from Moscow Vnukovo International Airport (VKO/UUWW)
	to Ufa International Airport (UFA/UWUU)
	132.31km far away from Innopolis
CPU times: user 35.3 ms, sys: 2.43 ms, total: 37.8 ms
Wall time: 15.7 s
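
The airport strings printed above follow the pattern `Name (IATA/ICAO)`. If the codes are needed separately, they can be split with a regular expression. This is a minimal sketch that assumes the label always ends with a three-character and a four-character code in parentheses; the name `parse_airport_label` is hypothetical.

```python
import re

# Matches labels such as "Kazan International Airport (KZN/UWKD)"
AIRPORT_LABEL = re.compile(
    r"^(?P<name>.+?)\s*\((?P<iata>[A-Z0-9]{3})/(?P<icao>[A-Z0-9]{4})\)$"
)


def parse_airport_label(label):
    """Split 'Name (IATA/ICAO)' into a dict; return None if the label does not match."""
    match = AIRPORT_LABEL.match(label.strip())
    return match.groupdict() if match else None


print(parse_airport_label("Kazan International Airport (KZN/UWKD)"))
# {'name': 'Kazan International Airport', 'iata': 'KZN', 'icao': 'UWKD'}
```

Returning `None` for labels that do not match keeps the caller in control of how to handle unexpected formats.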


### And now we close the browser

In [25]:
browser.quit()

## Headless

Rendering the page in a visible window consumes additional resources. Thus, we will now run our application with no browser window!

Browsers (at least [FF](https://developer.mozilla.org/en-US/docs/Mozilla/Firefox/Headless_mode) and [Chrome](https://intoli.com/blog/running-selenium-with-headless-chrome/)) have a headless mode, in which no window is rendered. This means it should work faster!

In [53]:
options = webdriver.ChromeOptions()
options.add_argument('--window-size=1200,600')  # Chrome expects width,height separated by a comma
options.add_argument('--headless')
browser = webdriver.Chrome(options=options)

In [57]:
%%time
report_flights(browser, innopolis, 8)

Airport Kazan International Airport (KZN/UWKD) is 37.95 km away.
UT363 flies
	from Moscow Vnukovo International Airport (VKO/UUWW)
	to Ufa International Airport (UFA/UWUU)
	151.16km far away from Innopolis
CPU times: user 22.9 ms, sys: 168 µs, total: 23 ms
Wall time: 8.81 s


In [58]:
browser.quit()