# AutoPlayer F1TV

## Context: 
I love Formula 1, and from time to time I like to watch race highlights and championship summary videos from older seasons on YouTube. Most of those videos, however, were taken down for copyright reasons. My solution was to sign up for F1TV, which offers coverage of much older races (the 1970s and 1980s included) in the [archive section](https://f1tv.formula1.com/page/493/archive).

## Problem:
There are no playlists on the F1TV site, so we have to find the links one by one to watch a whole season in sequence. Older YouTube videos (some ripped from VHS tapes and DVDs) played the race highlights back to back, so I could lie in bed and watch the whole sequence. The site's navigation has gone through several improvements recently, but the player is still not good.

## First solution:
After a brief Reddit search, I discovered [this website](https://g-hartmann.github.io/F1TVlinks/), which offers a fast way to find the videos I want. From there, we can copy a link to the clipboard and watch. However, the login session was easily lost during navigation, so I frequently had to log in again. We also had to open the videos one by one, click play, then click fullscreen: too many steps.

## Second solution:
I had some basic knowledge of the Selenium library (used for interface testing and for scraping data from websites). With Selenium's browser interaction commands, I started to think about ways to automate browser navigation and open the videos one by one, so I could watch the whole season review without getting out of bed. This was also a nice way to learn about automated browser interaction and navigation with Selenium :)

So, my problem was: *could I automate browser navigation to simulate an F1TV playlist using Python and Selenium?*

With the scope defined, I mapped the interaction flow needed to open a video in the site's player (including elements and wait times), researched cookies and session handling for the F1TV login, found out how to use the link copied to the clipboard by the g-hartmann website, and developed a solution to this small problem!

The Jupyter notebook in this folder details the prototyping process step by step. It takes as parameters the year you want to watch and the number of the race you want to start from :)

**USAGE (script)**:
>    Configure .env with credentials (login and password for F1TV)

>    Execute python script: `python3.9 watch-f1tv.py <year> <race number>`
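
A minimal sketch of how the two command-line arguments could be read is shown here. The helper name `parse_cli_args`, the validation rules, and the default starting race of 0 are assumptions for illustration; they are not taken from `watch-f1tv.py`.

```python
def parse_cli_args(argv):
    """Parse `<year> <race number>` from a list such as sys.argv[1:].

    Hypothetical helper: returns (season, initial_gp) in the form the
    notebook uses (season as a string, race number as an int).
    """
    usage = "usage: python3.9 watch-f1tv.py <year> <race number>"
    if not 1 <= len(argv) <= 2:
        raise SystemExit(usage)
    season = argv[0]
    if not (season.isdigit() and len(season) == 4):
        raise SystemExit(f"invalid year: {season!r}")
    # Assumption: the race number is optional and defaults to the first race (0).
    try:
        initial_gp = int(argv[1]) if len(argv) == 2 else 0
    except ValueError:
        raise SystemExit(f"invalid race number: {argv[1]!r}")
    if initial_gp < 0:
        raise SystemExit("race number must be zero or positive")
    return season, initial_gp


# Example with an explicit argument list
season, initialGP = parse_cli_args(["2016", "3"])
print(season, initialGP)
```

In a script, the list passed in would typically be `sys.argv[1:]`.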



**REQUIREMENTS**
* Google Chrome
* Selenium==4.1.3
* chromedriver_autoinstaller==0.3.1
* python-dotenv==0.13.0
* undetected-chromedriver==3.1.5post4

**USAGE (notebook)**
- Fill in the login and password in the .env file
- Modify the second cell with the year and race number parameters
- Run all cells from Jupyter Notebook

**Step 1:** Import Libs

In [None]:
from selenium.webdriver.support.ui import Select, WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.chrome.service import Service
from selenium import webdriver
import chromedriver_autoinstaller
from time import sleep
import sys
import os

from selenium_stealth import stealth

import clipboard
import urllib.request
import undetected_chromedriver.v2 as uc
from dotenv import load_dotenv

**Step 2:** Load the credentials from the .env file and set the desired season and the initial grand prix.

In [None]:
load_dotenv()
f1tv_login = os.getenv("F1TV_CREDENTIAL_LOGIN")
f1tv_password = os.getenv("F1TV_CREDENTIAL_PASSWORD")

season = '2016'
initialGP = 0
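
`os.getenv` returns `None` when a variable is not set, so a missing entry in `.env` would only surface later, at the login step. The sketch below checks both values up front. The helper name `require_credentials` and the error message are hypothetical; the variable names are the ones used in the cell above.

```python
import os


def require_credentials(env=os.environ):
    """Return (login, password), or raise if either variable is missing or empty."""
    names = ("F1TV_CREDENTIAL_LOGIN", "F1TV_CREDENTIAL_PASSWORD")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("missing in .env: " + ", ".join(missing))
    return env[names[0]], env[names[1]]


# Example with an explicit mapping instead of the real environment
login, password = require_credentials(
    {"F1TV_CREDENTIAL_LOGIN": "user@example.com", "F1TV_CREDENTIAL_PASSWORD": "secret"}
)
```

In the notebook it would be called without arguments, right after `load_dotenv()`.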

**Step 3:** Definition of the initialization function for the webdriver object.

* The first version of the script used the conventional Selenium webdriver object.
At first, it worked well. However, the F1TV website later moved to a new login platform that was able to detect that the browser was controlled by automation software.

* As a result, the mapped flow stopped at the login page, where the request was denied.

* Inspecting the network requests, I confirmed that the website really did detect the browser as a bot. The [bot sannysoft website](https://bot.sannysoft.com/) showed that the automated browser's configuration differed from a regular one, so I looked for alternative ways to log in with the automated browser.

* One possibility was to change parameters like useAutomationExtension:

```python
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
```

* This did not work. Searching for clues on Stack Overflow, I found the library [selenium-stealth](https://github.com/diprajpatra/selenium-stealth), which configures the webdriver object by altering its properties:

```python
stealth(wb,
        languages=["en-US", "en"],
        vendor="Google Inc.",
        platform="Win32",
        webgl_vendor="Intel Inc.",
        renderer="Intel Iris OpenGL Engine",
        fix_hairline=True,
       )
```

* Another try, another failure, which had me looking around for alternatives. [This article](https://piprogramming.org/articles/How-to-make-Selenium-undetectable-and-stealth--7-Ways-to-hide-your-Bot-Automation-from-Detection-0000000017.html) mentions the library [undetected-chromedriver](https://github.com/ultrafunkamsterdam/undetected-chromedriver/), which creates a new WebDriver similar to Selenium's and simplifies driver version specification and settings.

* During tests, I found that the driver had to be installed on each execution, using the library [chromedriver_autoinstaller](https://github.com/yeongbin-jo/python-chromedriver-autoinstaller), which automates the chromedriver download and configuration.

```python
    chromedriver_autoinstaller.install()
    wb = uc.Chrome()
```

* With this library, the results on the bot detection website were identical to those of a non-automated browser, and we were able to get past the login page :)
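
One signal that bot detection pages commonly inspect is the standard `navigator.webdriver` property, which browsers set to true when they are driven by automation. The sketch below reads that flag through a driver's `execute_script` method. It is only an illustration: the helper name `looks_automated` is hypothetical, and nothing here confirms which signals the F1TV login actually checks.

```python
def looks_automated(driver):
    """Return True when the page can see the WebDriver automation flag.

    `driver` is any object with a Selenium-style execute_script method.
    """
    flag = driver.execute_script("return navigator.webdriver")
    # The script may return True, False or None depending on the browser
    return bool(flag)


class FakeDriver:
    """Stand-in used only to show the call without opening a browser."""

    def __init__(self, flag):
        self.flag = flag

    def execute_script(self, script):
        return self.flag


print(looks_automated(FakeDriver(True)))
print(looks_automated(FakeDriver(None)))
```

With a real session, the call would be `looks_automated(wb)` after loading any page.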


In [None]:

def initialize_webdriver(headless = True):    
    '''
    Return selenium webdriver (undetected_chromedriver) object.
    Note: the headless argument is currently not used; the browser window is always shown.
    '''

    chromedriver_autoinstaller.install()
    wb = uc.Chrome()
    
    return wb

**Step 4**: Initialize the webdriver and perform the flow!

1. Access the F1TV website
2. Click on cookie accept button
3. Click on the login button
4. On login page, send keys to login and password fields and press the login button
5. Now logged in, open the desired season's highlight archive link, getting the website elements for each GP
    * This was a nice change they made to the website! Before, to access older races we had to open them one by one, using links generated on another website ([g-hartmann](https://g-hartmann.github.io/F1TVlinks/)) by interacting with its dropdown menus.
    ```python
    # Older approach: generate the link on the g-hartmann page,
    # then open it in the F1TV window (wb)
    gpNr = 0
    wb2 = initialize_webdriver(headless=False)
    wb2.get('https://g-hartmann.github.io/F1TVlinks/')
    wait2 = WebDriverWait(wb2, 50)
    wait2.until(EC.visibility_of_all_elements_located((By.ID, 'year_dropdown')))
    # Select the season in the year dropdown
    year = Select(wb2.find_element(by=By.ID, value='year_dropdown'))
    year.select_by_value(season)
    # Select the GP by its position in the GP dropdown
    gp = wb2.find_element(by=By.ID, value='gp_dropdown')
    options = gp.find_elements(by=By.TAG_NAME, value='option')
    element = options[gpNr]
    select = Select(gp)
    select.select_by_value(element.get_attribute("value"))
    # The button copies the generated link to the clipboard
    btnlink = wb2.find_element(by=By.ID, value='genButton')
    btnlink.click()
    wb.get(clipboard.paste())
    ```
    * With the new interface, we can navigate to the season and access the GP elements, already sorted, without resorting to anything other than the F1TV website itself.
    ```python
    url_season = f'https://f1tv.formula1.com/search/?filter_objectSubtype=Highlights&orderBy=meeting_Number&sortOrder=asc&filter_year={season}&filter_orderByFom=Y&title=The%20Story%20of%20the%20Season&pageID=967_63'
    wb.get(url_season)
    wait.until(EC.visibility_of_element_located((By.CLASS_NAME, 'video-card-item-metadata-container')))
    elements = wb.find_elements(by=By.CLASS_NAME, value='video-card-item-metadata-container')
    ```
6. For each element:
    * click to open, 
    * wait for the video element to appear
    * click on play
    * wait for the full screen player button element to appear
    * click full screen
    * check current play timestamp in comparison with total time
    * if they are equal, the video ended.
    * when the video ends, click on replay button, then after the fullscreen button reappears click on fullscreen to toggle off
    * return to season URL
    * select next element


In [None]:
# 1. Access the F1TV website
#Initialize Chrome window for f1tv
wb = initialize_webdriver(headless = False)
wb.get('https://f1tv.formula1.com/')

In [None]:
# 2. Click on cookie accept
btn_cookie = wb.find_element(by=By.XPATH, value='/html/body/div[2]/div/div/div[2]/div[3]/div[2]/button[2]')
btn_cookie.click()

In [None]:
# 3. Click on the login button
xpath_loginpg = '/html/body/div[1]/div/div/div/div[1]/div/div/div/div/div/a'
wait = WebDriverWait(wb, 10)
wait.until(EC.visibility_of_element_located((By.XPATH,xpath_loginpg)))

loginpg = wb.find_element(by=By.XPATH, value=xpath_loginpg)
loginpg.click()

In [None]:
# 4. On login page, send keys to login and password fields and press the login button
loginTxt = wb.find_element(by=By.CLASS_NAME, value='txtLogin')
password = wb.find_element(by=By.CLASS_NAME, value='txtPassword')
loginTxt.send_keys(f1tv_login)
password.send_keys(f1tv_password)
loginBtn_xpath = '/html/body/div[2]/main/div/div[3]/div/form/div[4]/button'
wait.until(EC.visibility_of_element_located((By.XPATH, loginBtn_xpath)))
btn = wb.find_element(by=By.XPATH, value=loginBtn_xpath)
btn.click()

In [None]:
# 5. Now logged in, open the desired season's highlight archive link, getting the website elements for each GP
url_season = f'https://f1tv.formula1.com/search/?filter_objectSubtype=Highlights&orderBy=meeting_Number&sortOrder=asc&filter_year={season}&filter_orderByFom=Y&title=The%20Story%20of%20the%20Season&pageID=967_63'
wb.get(url_season)
wait.until(EC.visibility_of_element_located((By.CLASS_NAME, 'video-card-item-metadata-container')))
elements = wb.find_elements(by=By.CLASS_NAME, value='video-card-item-metadata-container')
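
The season URL is written out by hand in several cells. As a sketch, it could be built in one place from its query parameters with the standard library. The helper name `build_season_url` is hypothetical; the parameter names and values are copied from the URL above.

```python
from urllib.parse import quote, urlencode


def build_season_url(season):
    """Build the highlights search URL for one season (hypothetical helper)."""
    params = {
        "filter_objectSubtype": "Highlights",
        "orderBy": "meeting_Number",
        "sortOrder": "asc",
        "filter_year": season,
        "filter_orderByFom": "Y",
        "title": "The Story of the Season",
        "pageID": "967_63",
    }
    # quote_via=quote encodes spaces as %20, matching the URL used in the notebook
    return "https://f1tv.formula1.com/search/?" + urlencode(params, quote_via=quote)


print(build_season_url("2016"))
```

Keeping the parameters in a dictionary makes it easier to see what the search filters on.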


In [None]:
import time
url_season = f'https://f1tv.formula1.com/search/?filter_objectSubtype=Highlights&orderBy=meeting_Number&sortOrder=asc&filter_year={season}&filter_orderByFom=Y&title=The%20Story%20of%20the%20Season&pageID=967_63'
wb.get(url_season)
wait.until(EC.visibility_of_element_located((By.CLASS_NAME, 'video-card-item-metadata-container')))

elements = wb.find_elements(by=By.CLASS_NAME, value='video-card-item-metadata-container')
race_numbers = len(elements)

current_race = initialGP
# 6. For each element:
while current_race < race_numbers:
    element = elements[current_race]
    wait = WebDriverWait(wb, 10)
    #click to open
    element.click()
    
    #wait for the video element to appear
    wait.until(EC.visibility_of_element_located((By.CLASS_NAME, 'inset-video-item-play-action-container')))
    video = wb.find_element(by=By.CLASS_NAME, value='inset-video-item-play-button')
    
    #click on play
    video.click()

    #wait for the full screen player button element to appear
    wait = WebDriverWait(wb, 10)
    wait.until(EC.visibility_of_element_located((By.ID, 'bmpui-id-76')))
    
    #click full screen
    btnFullScreen = wb.find_element(by=By.ID, value='bmpui-id-76')
    btnFullScreen.click()

    
    video_is_playing = True
    while video_is_playing:
        time.sleep(10)
        
        #check current play timestamp in comparison with total time
        progress = wb.find_element(by=By.ID, value='bmpui-id-57')
        progresstotal = wb.find_element(by=By.ID, value='bmpui-id-65')
        print(progress.get_attribute('innerHTML'), progresstotal.get_attribute('innerHTML'))
        
        #if they are equal, the video ended.
        video_is_playing = (progress.get_attribute('innerHTML') != progresstotal.get_attribute('innerHTML'))
    #when the video ends, click on replay button, then after the fullscreen button reappears click on fullscreen to toggle off
    wb.find_element(by=By.ID, value='bmpui-id-91').click()
    wait = WebDriverWait(wb, 10)
    wait.until(EC.visibility_of_element_located((By.ID, 'bmpui-id-76')))

    btnFullScreen.click()
    #return to season URL
    wb.get(url_season)
    wait.until(EC.visibility_of_element_located((By.CLASS_NAME, 'video-card-item-metadata-container')))
    #select next element
    elements = wb.find_elements(by=By.CLASS_NAME, value='video-card-item-metadata-container')
    current_race += 1
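
The inner `while video_is_playing` loop above has no upper bound, so it would keep polling forever if playback stalled. A bounded polling helper is sketched below as one possible safeguard. The name `poll_until` and the three-hour default are arbitrary assumptions; the `sleep` and `clock` parameters exist only so the helper can be exercised without real waiting.

```python
import time


def poll_until(predicate, interval=10.0, timeout=3 * 60 * 60,
               sleep=time.sleep, clock=time.monotonic):
    """Call `predicate` every `interval` seconds until it returns True.

    Returns True if the predicate succeeded, or False if `timeout`
    seconds passed first.
    """
    deadline = clock() + timeout
    while clock() < deadline:
        if predicate():
            return True
        sleep(interval)
    return False


# Example: the predicate becomes true on its third call
calls = []
finished = poll_until(lambda: len(calls) >= 2 or calls.append(1),
                      interval=0, timeout=5)
print(finished)
```

In the notebook, the predicate would be a small function that reads the two timestamp elements and reports whether the video has ended.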

### Wrapping it all
So with this small code snippet, we demonstrate some basic Selenium navigation concepts, such as:
* Finding elements by different criteria:
```python
wb.find_element(by=By.XPATH, value='/html/body/div[2]/div/div/div[2]/div[3]/div[2]/button[2]')
wb.find_element(by=By.ID, value='bmpui-id-65')
wb.find_elements(by=By.CLASS_NAME, value='video-card-item-metadata-container')
```
* Waiting until elements are visible:
```python
wait = WebDriverWait(wb, 10)
wait.until(EC.visibility_of_element_located((By.ID, 'bmpui-id-76')))
```
* Getting HTML values and using them to make decisions:
```python
progress.get_attribute('innerHTML')
video_is_playing = (progress.get_attribute('innerHTML') != progresstotal.get_attribute('innerHTML'))
```
* Sending text to text fields on forms:
```python
loginTxt = wb.find_element(by=By.CLASS_NAME, value='txtLogin')
loginTxt.send_keys(f1tv_login)
```
* Loading content from .env:
```python
from dotenv import load_dotenv
load_dotenv()
f1tv_login = os.getenv("F1TV_CREDENTIAL_LOGIN")
f1tv_password = os.getenv("F1TV_CREDENTIAL_PASSWORD")
```
* In the older snippet, how to manage content on the clipboard:
```python
import clipboard
clipboard.paste()
```
* Alternatives to selenium to prevent bot detection (detailed in previous sections)

This small project helped me a lot in understanding the basic concepts and functionality of scraping and web automation. I had a lot of fun making it and learning along the way, and the playlist works beautifully :)