# Premium ground vehicles

## Gather URLs from War Thunder wiki

Gathers the premium ground vehicle URLs from the War Thunder wiki and caches them in a text file, one URL per line. At some point the cache will move to S3.

In [None]:
import os
import importlib
import utils

importlib.reload(utils)

urls_filename = os.path.join(
    os.path.abspath(''),
    'data',
    'premium_ground_vehicles',
    'urls.txt'
)

vehicle_urls = await utils.generate_premium_ground_vehicle_urls()
await utils.write_file(urls_filename, "\n".join(vehicle_urls))
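
The `utils` module is not shown in this notebook, so the following is only a minimal sketch of what async file helpers like `utils.write_file` and `utils.read_file` could look like. It assumes they take a filename and plain text, and it pushes the blocking file I/O onto a worker thread with `asyncio.to_thread` (Python 3.9+). The `_write_text` and `_read_text` helpers are hypothetical names introduced for illustration.

```python
import asyncio
import os


def _write_text(filename: str, contents: str) -> None:
    # Create the parent directory so the very first run does not fail.
    os.makedirs(os.path.dirname(filename) or ".", exist_ok=True)
    with open(filename, "w", encoding="utf-8") as f:
        f.write(contents)


def _read_text(filename: str) -> str:
    with open(filename, encoding="utf-8") as f:
        return f.read()


async def write_file(filename: str, contents: str) -> None:
    # Hypothetical stand-in for utils.write_file: blocking I/O runs in a thread.
    await asyncio.to_thread(_write_text, filename, contents)


async def read_file(filename: str) -> str:
    # Hypothetical stand-in for utils.read_file.
    return await asyncio.to_thread(_read_text, filename)
```

Keeping the helpers async lets them be awaited from notebook cells in the same way as the scraping calls, without blocking the event loop while the file is written.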

## Gather HTML from vehicle URLs

This caches the HTML from each vehicle's wiki URL so that further data points can be gathered later without scraping the wiki again. At some point this will use S3.

In [1]:
import os
import asyncio
import aiohttp
import importlib
import utils

importlib.reload(utils)

vehicle_data_path = os.path.join('data', 'premium_ground_vehicles')
html_path = os.path.join(os.path.abspath(''), vehicle_data_path, 'html')
urls_filename = os.path.join(os.path.abspath(''), vehicle_data_path, 'urls.txt')
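# One URL per line; splitlines() avoids an empty entry when the file is empty.
urls = (await utils.read_file(urls_filename)).strip().splitlines()

# Share one session across all requests and download the pages concurrently.
async with aiohttp.ClientSession() as session:
    await asyncio.gather(*[
        utils.cache_vehicle_html(url, html_path, session=session) for url in urls
    ])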
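
The cell above starts every download at once. As a hedged sketch of how the caching step could skip pages that are already on disk and cap the number of simultaneous requests, the example below uses an `asyncio.Semaphore`. It is not the actual `utils.cache_vehicle_html`: the `fetch` callable stands in for the real HTTP request, and `html_filename`, `cache_html`, `cache_all`, and the "last URL path segment" naming rule are all assumptions made for illustration.

```python
import asyncio
import os
from typing import Awaitable, Callable, List
from urllib.parse import urlparse

# Any async callable that takes a URL and returns the page HTML.
Fetch = Callable[[str], Awaitable[str]]


def html_filename(url: str, html_path: str) -> str:
    # Assumption: the last path segment of the URL identifies the vehicle.
    name = urlparse(url).path.rstrip("/").rsplit("/", 1)[-1] or "index"
    return os.path.join(html_path, f"{name}.html")


async def cache_html(url: str, html_path: str, fetch: Fetch,
                     limit: asyncio.Semaphore) -> bool:
    """Return True if the page was downloaded, False if it was already cached."""
    filename = html_filename(url, html_path)
    if os.path.exists(filename):
        return False
    async with limit:  # cap the number of simultaneous requests
        html = await fetch(url)
    os.makedirs(html_path, exist_ok=True)
    with open(filename, "w", encoding="utf-8") as f:
        f.write(html)
    return True


async def cache_all(urls: List[str], html_path: str, fetch: Fetch,
                    max_concurrent: int = 5) -> List[bool]:
    limit = asyncio.Semaphore(max_concurrent)
    return await asyncio.gather(
        *[cache_html(url, html_path, fetch, limit) for url in urls]
    )
```

Skipping files that already exist makes the cell safe to re-run after a partial failure, and the semaphore keeps the load on the wiki bounded. The default of five concurrent requests is an arbitrary placeholder.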

## Extract ground vehicle data points

This extracts data points out of the previously cached HTML files.

Each data point has a corresponding extraction function, as the method to extract each data point can differ slightly. New data points require a corresponding extraction function.
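
One way to organise "one extraction function per data point" is a small registry that maps each data point name to its function. The sketch below is an assumption about structure only: the `data_point` decorator, `EXTRACTORS`, and `extract_all` are hypothetical names, and the regular expressions assume a made-up page layout (name in the first `<h1>`, rank written as "Rank IV"), not the real War Thunder wiki markup.

```python
import re
from typing import Callable, Dict, Optional

Extractor = Callable[[str], Optional[str]]

# Registry: data point name -> extraction function.
EXTRACTORS: Dict[str, Extractor] = {}


def data_point(name: str):
    """Register an extraction function under a data point name."""
    def register(func: Extractor) -> Extractor:
        EXTRACTORS[name] = func
        return func
    return register


@data_point("name")
def extract_name(html: str) -> Optional[str]:
    # Assumption: the vehicle name is the text of the first <h1> element.
    match = re.search(r"<h1[^>]*>(.*?)</h1>", html, re.DOTALL)
    return match.group(1).strip() if match else None


@data_point("rank")
def extract_rank(html: str) -> Optional[str]:
    # Assumption: the page contains text such as "Rank IV" (Roman numerals).
    match = re.search(r"\bRank\s+([IVX]+)\b", html)
    return match.group(1) if match else None


def extract_all(html: str) -> Dict[str, Optional[str]]:
    # Run every registered extractor; a missing data point yields None.
    return {name: func(html) for name, func in EXTRACTORS.items()}
```

With this layout, adding a data point means writing one decorated function. Regular expressions keep the sketch dependency-free; a real HTML parser would likely be more robust against markup changes.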

In [None]:
import importlib
import utils

importlib.reload(utils)


The following vehicle data points will be gathered (tracking with ✅/❌):
- ❌ Vehicle name
- ❌ Country
- ❌ Type of tank
- ❌ Rank
- ❌ BR for each game mode
- ❌ Cost (pull from "Purchase" on page, if exists)
- ❌ Wheels vs treads
- ❌ Hull armor (front/side/back)
- ❌ Turret armor (front/side/back)
- ❌ Crew members
- ❌ Visibility
- ❌ Horizontal guidance
- ❌ Vertical guidance
- ❌ Is amphibious
- ❌ Forward speed (AB)
- ❌ Forward speed (RB/SB)
- ❌ Back speed (AB)
- ❌ Back speed (RB/SB)
- ❌ Engine power (AB)
- ❌ Engine power (RB/SB)
- ❌ Power-to-weight ratio (AB)
- ❌ Power-to-weight ratio (RB/SB)
- ❌ Weight (tons)
- ❌ Repair cost (AB)
- ❌ Repair cost (RB/SB)
- ❌ Crew training
- ❌ Crew training (Expert)
- ❌ Crew training (Aces)
- ❌ Crew training (Research Aces)
- ❌ Modifications list
- ❌ First stage ammunition amount (maybe?)
- ❌ Reload time
- ❌ Max ammo
- ❌ Has stabilizer
- ❌ Fire rate
- ❌ Ammunitions
  - ❌ name
  - ❌ type
  - ❌ pen @ 10m
  - ❌ pen @ 100m
  - ❌ pen @ 500m
  - ❌ pen @ 1000m
  - ❌ pen @ 1500m
  - ❌ pen @ 2000m
  - ❌ projectile velocity
  - ❌ projectile mass
  - ❌ fuse delay
  - ❌ fuse sensitivity
  - ❌ explosive mass
  - ❌ ricochet angle (degrees) at 0% chance
  - ❌ ricochet angle (degrees) at 50% chance
  - ❌ ricochet angle (degrees) at 100% chance
- ❌ Coax machine gun caliber
- ❌ Has mounted MG
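
As a rough sketch of how a subset of the data points above could be held once extracted, the dataclasses below model a vehicle with nested ammunition records. The class names, field names, and units (mm, m/s) are assumptions for illustration and cover only a few of the listed items. The `missing` helper mirrors the ✅/❌ tracking by reporting which fields are still empty.

```python
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional


@dataclass
class Ammunition:
    name: str
    type: str
    # Penetration keyed by distance in metres (10, 100, 500, ...); mm assumed.
    penetration_mm: Dict[int, int] = field(default_factory=dict)
    velocity_m_s: Optional[float] = None


@dataclass
class GroundVehicle:
    name: str
    country: Optional[str] = None
    rank: Optional[int] = None
    # Battle rating per game mode, keyed by "AB", "RB", or "SB".
    battle_rating: Dict[str, float] = field(default_factory=dict)
    ammunition: List[Ammunition] = field(default_factory=list)

    def missing(self) -> List[str]:
        """Names of data points that have not been extracted yet."""
        return [k for k, v in asdict(self).items() if v in (None, {}, [])]
```

A usage example with placeholder values (not real game data):

```python
vehicle = GroundVehicle(name="Example Tank", rank=4)
print(vehicle.missing())
```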