# Flightradar analysis

In this task we will use the https://www.flightradar24.com/ website to monitor airplanes near Innopolis. To do so, we will use information from the URL and from the moving airplane icons on the map. Let's have fun with airplanes!

**NB:** This lab is designed to be executed **locally** on your laptop, because it launches a local application (a browser). Headless mode can be used in Colab, but that requires additional browser installation steps. So please use Anaconda.

## Dependency installation

Let's try to load and parse the page the way we did before:

In [64]:
import requests
from bs4 import BeautifulSoup
resp = requests.get("https://www.flightradar24.com/")
print("Status:", resp.status_code)

Status: 451


Wowowow! https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/451 :

```
The HyperText Transfer Protocol (HTTP) '451 Unavailable For Legal Reasons' client error response code indicates that the user requested a resource that is not available due to legal reasons, such as a web page for which a legal action has been issued.
```
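By the way, the status code can also be decoded locally. A minimal sketch, assuming Python 3.8 or newer (the version in which the standard library's `http.HTTPStatus` learned about code 451):

```python
from http import HTTPStatus

# look up the standard reason phrase for the code we received
status = HTTPStatus(451)
print(status.value, status.phrase)
```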

As we see, the output is not what we expected. So what can we do when a page is not served as ready HTML, but is rendered by a script, and only in a real browser? Browser engines can help us get the data. Let's load the same web page in a different way: we give a browser some time to load the scripts and run them. Then we work with the DOM (Document Object Model), but we obtain this DOM from the browser engine itself, not via `BeautifulSoup`.

Where do we get a browser engine from? Simply installing a browser will do. How do we send commands to it from code and retrieve the DOM? Service applications called `drivers` interpret commands and translate them into browser actions.

For each supported browser engine you will need to:
1. install the browser itself;
2. download the 'driver', a binary executable that passes commands from selenium to the browser, e.g. [geckodriver for Firefox](https://github.com/mozilla/geckodriver/releases), [ChromeDriver](http://chromedriver.storage.googleapis.com/index.html);
3. unpack the driver into a folder listed in the PATH environment variable, or specify the exact binary location in your code.

### Download driver

Download the driver and place it in any folder, or in a folder listed in the PATH environment variable. [Firefox](https://github.com/mozilla/geckodriver/releases), [Chrome](http://chromedriver.storage.googleapis.com/index.html)

### Install selenium

Selenium is a powerful tool for automated UI testing. We will use it to emulate user actions on the website.

In [65]:
# !pip install -U selenium

Check it works

In [66]:
from selenium import webdriver

### Launch browser

This will open a browser window

In [67]:
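browser = webdriver.Chrome()
# or explicitly (Selenium 4 style: driver path via Service, browser path via options)
# from selenium.webdriver.firefox.service import Service
# options = webdriver.FirefoxOptions()
# options.binary_location = 'C:/Program Files/Mozilla Firefox/firefox.exe'
# browser = webdriver.Firefox(
#     service=Service(executable_path='C:/bin/geckodriver.exe'),
#     options=options
# )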

### Download the page ... again

In [68]:
from selenium.webdriver.common.by import By

# navigate to page
browser.get('https://www.flightradar24.com/')
browser.implicitly_wait(10)  # element lookups keep retrying for up to 10 seconds

# select all visible map icons (airplanes and airports) from the document
elements = browser.find_elements(By.CSS_SELECTOR, "div[role=button]")
# note: if this number differs from launch to launch, consider a longer wait
print("Elements found:", len(elements))

Elements found: 90
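
A fixed pause is fragile: too short and some icons are missing, too long and we waste time. Selenium ships `WebDriverWait` (in `selenium.webdriver.support.ui`) for explicit waits. To show the idea without a browser, here is a minimal standard-library sketch of the same polling pattern. The helper name `wait_until` and its parameters are hypothetical, not part of Selenium:

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call `condition` until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)


# Possible usage with the browser from this lab (not executed here):
# elements = wait_until(
#     lambda: browser.find_elements(By.CSS_SELECTOR, "div[role=button]")
# )
```

The helper returns as soon as the condition yields something truthy (for example, a non-empty list of elements), so it only waits as long as it has to.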


### Preparatory functions

We will center our map around Innopolis, and choose one of the suitable scales.

In [69]:
innopolis = "55.75,48.75"
scale = 9


def scale_km_per_px(scale):
    return 2 ** 8 / 3 / (2 ** scale)


def dist(a, b):
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** .5
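
To get a feeling for these helpers, here is a small worked example. It repeats the two functions so that it runs on its own; the pixel coordinates are made up for illustration:

```python
def scale_km_per_px(scale):
    return 2 ** 8 / 3 / (2 ** scale)


def dist(a, b):
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** .5


# each extra zoom level halves the number of kilometers per pixel
for s in (7, 8, 9):
    print(f"scale {s}: {scale_km_per_px(s):.3f} km per pixel")

# two made-up points that are 300 px apart horizontally and 400 px vertically
pixels = dist((0, 0), (300, 400))
print(f"{pixels} px is {pixels * scale_km_per_px(8):.2f} km at scale 8")
```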

## Solving the problem

### Obtain center coordinates

The first task is to get the pixel coordinates of the screen center. You are given a browser instance object, and we want to know the size of the rendered page (NB: not the window!). To get it, do the following:
1. Find the root `html` tag by tag name. Refer to the [`find_element` documentation](https://selenium-python.readthedocs.io/locating-elements.html) and the [`By` options](https://www.selenium.dev/selenium/docs/api/py/webdriver/selenium.webdriver.common.by.html).
2. Extract the **attribute** values of this tag. We are interested in `clientWidth` and `clientHeight`. [See this doc](https://selenium-python.readthedocs.io/api.html#selenium.webdriver.remote.webelement.WebElement.get_attribute) for usage.
3. Divide these values by 2 and return them as a tuple.

In [70]:
def get_center_point(browser):
    html = browser.find_element(By.TAG_NAME, "html")
    inner_width = int(html.get_attribute("clientWidth"))
    inner_height = int(html.get_attribute("clientHeight"))
    # in center
    innopolis_px = (inner_width / 2, inner_height / 2)
    return innopolis_px
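
The same numbers can also be read with a line of JavaScript through the driver's `execute_script` method. This is only an alternative sketch, not the required solution; the function name `get_center_point_js` is made up here:

```python
def get_center_point_js(browser):
    # document.documentElement is the root <html> element of the page;
    # the script returns numbers, so no int(...) conversion is needed
    width, height = browser.execute_script(
        "return [document.documentElement.clientWidth, "
        "document.documentElement.clientHeight];"
    )
    return (width / 2, height / 2)
```

Unlike `get_attribute`, which hands back strings, `execute_script` converts JavaScript numbers to Python numbers.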

### Catching the airplane

This code searches the map for airplane and airport icons and their coordinates. Your task is to complete the check that decides whether an icon is an airport or an airplane.

Airport example:

```
<div style="width: 20px; height: 20px; overflow: hidden; position: absolute; cursor: pointer; touch-action: none; left: 131px; top: -89px; z-index: 1090430;" title="Yoshkar-Ola Airport (JOK/UWKJ)" aria-label="Yoshkar-Ola Airport (JOK/UWKJ)" role="button" tabindex="0">...</div>
```

Airplane example:

```
<div style="width: 33px; height: 33px; overflow: hidden; position: absolute; cursor: pointer; touch-action: none; left: -30px; top: 17px; z-index: 1031004;" title="" role="button" tabindex="-1">...</div>
```

Again, I think [get_attribute(...) call](https://selenium-python.readthedocs.io/api.html#selenium.webdriver.remote.webelement.WebElement.get_attribute) can help in distinguishing these.
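
To see the difference without a browser, we can feed the two snippets to the standard-library HTML parser. This is only an offline illustration: the snippets are abbreviated (the `style` attribute is dropped) and the class name `ButtonKind` is made up here. In the lab itself, the same check is done on Selenium elements.

```python
from html.parser import HTMLParser


class ButtonKind(HTMLParser):
    """Record 'airport' or 'airplane' for every div with role="button"."""

    def __init__(self):
        super().__init__()
        self.kinds = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div" and attrs.get("role") == "button":
            # airports carry a non-empty aria-label, airplanes have none
            self.kinds.append("airport" if attrs.get("aria-label") else "airplane")


airport_html = ('<div title="Yoshkar-Ola Airport (JOK/UWKJ)" '
                'aria-label="Yoshkar-Ola Airport (JOK/UWKJ)" '
                'role="button" tabindex="0"></div>')
airplane_html = '<div title="" role="button" tabindex="-1"></div>'

parser = ButtonKind()
parser.feed(airport_html + airplane_html)
print(parser.kinds)
```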

In [71]:
def spot_some_air_stuff(browser):
    # these are all the elements, corresponding to the desired filter
    elements = browser.find_elements(By.CSS_SELECTOR, "div[role=button][tabindex='-1']")
    airports = []
    airplanes = []
    
    for element in elements:
        # this is where you choose whether this is an airport or an airplane:
        # airports have a non-empty aria-label, airplanes do not
        aria = element.get_attribute('aria-label')
        if aria:
            airports.append(element)
        else:
            airplanes.append(element)
    return airports, airplanes

### Get the info from the pane

When we click on the airplane image, a side pane appears. We will read the info from this pane.

In [72]:
def get_flight_info(browser):
    flight = browser.find_element(By.CSS_SELECTOR, 'h2.airline-info__flight-no')
    dep = browser.find_element(By.CSS_SELECTOR, "a.dep.iata")
    dest = browser.find_element(By.CSS_SELECTOR, "a.arr.iata")
    flight_number = flight.text
    departure = dep.get_attribute('data-tooltip-value')
    destination = dest.get_attribute('data-tooltip-value')   
    return flight_number, departure, destination

### And here is the main method

Add some missing code lines, where TODO is specified.

In [73]:
def report_flights(browser, center, scale):
    import time

    browser.get(f"https://www.flightradar24.com/{center}/{scale}")
    # let element lookups retry for up to 5 seconds
    browser.implicitly_wait(5)
    # give dynamic elements time to load
    time.sleep(5)
    innopolis_px = get_center_point(browser)   
    airports, airplanes = spot_some_air_stuff(browser)
    for element in airports:
        # shift by half the icon size to get the icon center
        coord = (element.location['x'] + element.size['width'] // 2, 
                 element.location['y'] + element.size['height'] // 2)
        d = dist(innopolis_px, coord) * scale_km_per_px(scale)
        print(f"Airport {element.get_attribute('aria-label')} is {d:.2f} km away.")
    
    for element in airplanes:
        try:
            # TODO click on the airplane icon (element). See https://selenium-python.readthedocs.io/api.html#selenium.webdriver.remote.webelement.WebElement.click
            element.click()
            
            # let it render the pane
            time.sleep(1)
            # extract flight info from the pane
            flight_number, departure, destination = get_flight_info(browser)
            # shifts are due to airplane figure size
            coord = (element.location['x'] + element.size['width'] // 2, 
                     element.location['y'] + element.size['height'] // 2)
            d = dist(innopolis_px, coord) * scale_km_per_px(scale)
            message = (f"{flight_number} flies\n\tfrom " + 
                       f"{departure}\n\tto " + 
                       f"{destination}\n\t" + 
                       f"{d:.2f}km far away from Innopolis")
            message = message.replace("<br>", " ")
            print(message)

            # TODO: click on the [x] in the corner of the panel.
            # this is an <a> tag with 'close-panel' class
            # NB: sometimes this can also raise an exception due to occlusion
            # your code here
            browser.find_element(By.CSS_SELECTOR, "a.close-panel").click()
        except Exception as e:
            pass
            # print(e)
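
The distance arithmetic inside the loops can be checked without a browser. The sketch below uses plain dictionaries shaped like Selenium's `element.location` and `element.size`; the helper names and all the numbers are made up for illustration:

```python
def icon_center(location, size):
    # shift the top-left corner by half the icon size
    return (location["x"] + size["width"] // 2,
            location["y"] + size["height"] // 2)


def km_from_center(center_px, location, size, scale):
    cx, cy = icon_center(location, size)
    pixels = ((center_px[0] - cx) ** 2 + (center_px[1] - cy) ** 2) ** 0.5
    # same km-per-pixel factor as scale_km_per_px(scale)
    return pixels * 2 ** 8 / 3 / 2 ** scale


# a 1200x600 page has its center at (600, 300)
center = (600, 300)
# a 20x20 icon whose center lies 300 px right and 400 px below the page center
location = {"x": 890, "y": 690}
size = {"width": 20, "height": 20}

print(icon_center(location, size))
print(f"{km_from_center(center, location, size, 8):.2f} km")
```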

In [74]:
%%time
report_flights(browser, innopolis, 8)

Airport Ulyanovsk Vostochny Airport (ULY/UWLW) is 139.68 km away.
Airport Cheboksary Airport (CSY/UWKS) is 91.01 km away.
Airport Kazan International Airport (KZN/UWKD) is 37.84 km away.
SU1231 flies
	from Ufa International Airport (UFA/UWUU)
	to Moscow Sheremetyevo International Airport (SVO/UUEE)
	46.61km far away from Innopolis
SU1191 flies
	from Kazan International Airport (KZN/UWKD)
	to Moscow Sheremetyevo International Airport (SVO/UUEE)
	25.18km far away from Innopolis
SU6076 flies
	from Ulyanovsk Baratayevka Airport (ULV/UWLL)
	to Moscow Sheremetyevo International Airport (SVO/UUEE)
	190.98km far away from Innopolis
CPU times: total: 46.9 ms
Wall time: 20.7 s


### And now we close the browser

In [75]:
browser.quit()

## Headless

Rendering the page in a visible window consumes additional resources. So now we will run our application with no browser window!

Browsers (at least [FF](https://developer.mozilla.org/en-US/docs/Mozilla/Firefox/Headless_mode) and [Chrome](https://intoli.com/blog/running-selenium-with-headless-chrome/)) have a headless mode: no window is rendered. This means it should use fewer resources and usually run faster!

In [77]:
# options = webdriver.FirefoxOptions()  # for Firefox, pass '-headless' below instead
options = webdriver.ChromeOptions()
# `options.headless = True` is deprecated in recent Selenium releases
options.add_argument('--headless=new')
options.add_argument('--window-size=1200,600')
browser = webdriver.Chrome(options=options)


In [78]:
%%time
report_flights(browser, innopolis, 8)

Airport Kazan International Airport (KZN/UWKD) is 37.95 km away.
CPU times: total: 46.9 ms
Wall time: 16.1 s


In [79]:
browser.quit()