# Preparation of Weather Data
The following notebook demonstrates how weather data was scraped from airdensityonline.com, a site that provides free-to-use weather data for race tracks, in this case Formula 1 circuits.

The site can be found [here](https://airdensityonline.com/tracks/).

## Imports & Helper Methods

### Import Modules

In [275]:
import requests
import csv
from bs4 import BeautifulSoup as bs
import os

### 12 Hour to 24 Hour Time Function
Converts the scraped 12-hour time into the more readable 24-hour format.

In [221]:
# Convert a 12-hour time such as "9:30 pm" to 24-hour "HH:MM:SS"
def convert24(raw):

    # 12:xx am becomes 00:xx
    if raw[-2:] == "am" and raw[:2] == "12":
        return "00" + raw[2:-3] + ":00"

    # 10:xx am and 11:xx am only need the suffix removed
    elif raw[-2:] == "am" and (raw[:2] == "11" or raw[:2] == "10"):
        return raw[:-3] + ":00"

    # Single-digit am hours: remove the suffix and zero-pad the hour
    elif raw[-2:] == "am":
        return "0" + raw[:-3] + ":00"

    # 12:xx pm stays as 12:xx
    elif raw[-2:] == "pm" and raw[:2] == "12":
        return raw[:5] + ":00"

    # Two-digit pm hours (10, 11): add 12 to the hour and remove the suffix
    elif raw[2] == ":":
        return str(int(raw[:2]) + 12) + raw[2:5] + ":00"

    # Single-digit pm hours: add 12 to the hour and remove the suffix
    else:
        return str(int(raw[:1]) + 12) + ":" + raw[2:4] + ":00"
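
As a cross-check, the same conversion can be sketched with the standard library's `datetime.strptime`. This is an alternative rather than the notebook's method: the name `convert24_strptime` is hypothetical, and the sketch assumes inputs shaped like `9:30 pm` (inferred from the slicing in `convert24`) and an English or C locale for the am/pm marker.

```python
from datetime import datetime

def convert24_strptime(raw):
    # Hypothetical alternative to convert24; assumes input like "9:30 pm".
    # %I parses the 12-hour clock hour and %p the am/pm marker.
    parsed = datetime.strptime(raw.strip(), "%I:%M %p")
    return parsed.strftime("%H:%M:%S")

print(convert24_strptime("12:05 am"))  # 00:05:00
print(convert24_strptime("9:30 pm"))   # 21:30:00
```

Delegating to `strptime` means malformed input raises a `ValueError` instead of silently producing an odd string.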

### Fahrenheit to Celsius Function
Converts a temperature from Fahrenheit to Celsius, rounded to the nearest whole degree.

In [253]:
def f_to_c(temp):
    temp = (int(temp) - 32) * 5/9
    return round(temp)
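
Because `f_to_c` calls `int(temp)`, a scraped reading with a decimal point such as `"98.6"` would raise a `ValueError`. A minimal sketch of a more tolerant variant is shown here; the name `f_to_c_float` is hypothetical, and whether the site ever reports decimal temperatures is an assumption.

```python
def f_to_c_float(temp):
    # Hypothetical variant of f_to_c: float() accepts both "72" and "98.6".
    celsius = (float(temp) - 32) * 5 / 9
    return round(celsius)

print(f_to_c_float("212"))   # 100
print(f_to_c_float("98.6"))  # 37
```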

## Scrape Website Weather Data
Here we scrape the data from the weather website, modifying certain attributes for easier processing.
You can specify the URL of the page to scrape and the path, relative to the `data/` directory, where the CSV file is saved.

**Note: This only works for links such as https://airdensityonline.com/track-history/Melbourne_Grand_Prix_Circuit/2020-03-21/**
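
A link of this shape can be assembled from a track name and a date. The helper below is a sketch: `build_history_url` and `BASE_URL` are hypothetical names, and it assumes other tracks and dates follow the same pattern as the example link above.

```python
from datetime import date

# Assumed base path, taken from the example link
BASE_URL = "https://airdensityonline.com/track-history"

def build_history_url(track, day):
    # track: the track name exactly as it appears in the URL
    # day: a datetime.date, rendered as YYYY-MM-DD
    return f"{BASE_URL}/{track}/{day.isoformat()}/"

print(build_history_url("Melbourne_Grand_Prix_Circuit", date(2020, 3, 21)))
```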

In [274]:
def scrape_weather(url, path):
    # Fetch the page and collect the data rows, skipping the first (header) row
    response = requests.get(url)
    soup = bs(response.content, 'html.parser')
    rows = soup.find(class_='forecastdata').find_all('ul', recursive=False)[1:]

    # Create the CSV file (relative to data/) and write the header
    out_path = 'data/' + path
    with open(out_path, 'w', newline='') as file:
        writer = csv.writer(file)
        writer.writerow(['time', 'temp', 'humidity', 'barometer', 'dew_point', 'grains', 'wind', 'air_density', 'density_alt'])

        # Iterate over rows and scrape data
        for row in rows:
            cells = [li.text.strip() for li in row.find_all('li')]
            time = cells[0][7:]
            temp = cells[1][:-6]
            humidity = cells[2][:-1]
            barometer = cells[3][:-3]
            dew_point = cells[4][:-6]
            grains = cells[5]
            wind = cells[6]
            air_density = cells[7]
            density_alt = cells[8]

            # Convert particular data
            time = convert24(time)
            temp = f_to_c(temp)
            dew_point = f_to_c(dew_point)

            writer.writerow([time, temp, humidity, barometer, dew_point, grains, wind, air_density, density_alt])

    return "File Generated. Size: " + str(os.stat(out_path).st_size)
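
Opening the output file for writing fails with a `FileNotFoundError` if the target directory, for example `data/weather/`, does not exist yet. One way to guard against that is sketched below; `ensure_parent_dir` is a hypothetical helper, not part of the notebook, and the demo uses a temporary directory rather than the real `data/` folder.

```python
import os
import tempfile

def ensure_parent_dir(file_path):
    # Create the directory that will contain file_path, if it has one.
    parent = os.path.dirname(file_path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    return file_path

# Demo in a throwaway directory so nothing is written to the project
with tempfile.TemporaryDirectory() as tmp:
    target = ensure_parent_dir(os.path.join(tmp, "data", "weather", "test.csv"))
    with open(target, "w", newline="") as handle:
        handle.write("time,temp\n")
    print(os.path.exists(target))
```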

## Example Usage

In [273]:
scrape_weather('https://airdensityonline.com/track-history/Melbourne_Grand_Prix_Circuit/2020-03-21/', 'weather/test.csv')

'File Generated. Size: 1390'
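
To inspect a generated file, it can be read back with `csv.DictReader`. The sketch below uses an in-memory sample instead of the real file: the header matches the one written by `scrape_weather`, but the data row is invented placeholder data, and `load_weather` is a hypothetical helper name.

```python
import csv
import io

# Header as written by scrape_weather; the data row is an invented placeholder
sample = (
    "time,temp,humidity,barometer,dew_point,grains,wind,air_density,density_alt\n"
    "14:00:00,22,55,29.92,12,60,5,1.2,300\n"
)

def load_weather(handle):
    # Each row becomes a dict keyed by column name; all values are strings.
    return list(csv.DictReader(handle))

rows = load_weather(io.StringIO(sample))
print(rows[0]["time"], rows[0]["temp"])
```

For a real file, the same function could be called on a handle opened with `open(path, newline='')`.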