# Virus Tracking

The CDC tracks virus cases by state: each state reports its current count to the CDC.
I want to see the number of cases in Texas as they grow over time, broken down by city if possible.

I have found only one website that offers this kind of data, and it shows only the current case counts:
https://dshs.texas.gov/news/updates.shtm

DSHS updates its data source every day at noon, so if I append to *my* data source every day at 1 pm (rather than replacing the current snapshot), I should be able to see the change over time.
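A minimal sketch of the append-rather-than-replace idea, assuming the history lives in a local CSV file. The file name `texas_cases_history.csv` and the helper `append_snapshot` are hypothetical; the record fields (`county`, `date`, `num_cases`) match the ones built in the scraping cell further down. Rows are keyed by county and date, so re-running on the same day overwrites that day's rows instead of duplicating them.

```python
import csv
from pathlib import Path

FIELDS = ["county", "date", "num_cases"]


def append_snapshot(records, path="texas_cases_history.csv"):
    """Merge one day's records into the history CSV and return all rows."""
    path = Path(path)
    history = {}

    # Load whatever has been saved on previous days, keyed by (county, date).
    if path.exists():
        with path.open(newline="") as f:
            for row in csv.DictReader(f):
                history[(row["county"], row["date"])] = {k: row[k] for k in FIELDS}

    # Add today's records; an existing (county, date) key is overwritten.
    for rec in records:
        history[(rec["county"], rec["date"])] = {k: str(rec[k]) for k in FIELDS}

    # Rewrite the file with the merged history.
    with path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(history.values())

    return list(history.values())
```

Calling `append_snapshot(num_cases)` once a day after the noon update would grow the file by one snapshot per day, and `pd.read_csv("texas_cases_history.csv")` would then load the full history for analysis.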

In [3]:
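import requests
from bs4 import BeautifulSoup

# NOTE: verify=False skips TLS certificate verification, which is insecure;
# drop it if the site's certificate validates.
response = requests.get("https://dshs.texas.gov/news/updates.shtm", verify=False, timeout=30)
response.raise_for_status()  # fail early on an HTTP error instead of parsing an error page
source = BeautifulSoup(response.text, "html.parser")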



This part loads the current day's county data from the DSHS website.

In [43]:
import pandas as pd
import datetime as dt

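# Find the county table by its "summary" attribute.
tables = source.find_all("table")
county_table = [t for t in tables if t.has_attr("summary") and t.attrs["summary"] == "COVID-19 Cases in Texas Counties"][0]

# Collect the cells of each row, skipping the first and last rows
# (assumed here to be the header and a total/footer row).
row_groups = [tr.find_all("td") for tr in county_table.find_all("tr")][1:-1]

today = dt.datetime.today()

# One record per county; strip whitespace and store the count as an integer.
num_cases = [{
    "county": td[0].text.strip(),
    "date": f"{today.month}/{today.day}/{today.year}",
    "num_cases": int(td[1].text.strip().replace(",", ""))
} for td in row_groups]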

df_num_cases = pd.DataFrame(num_cases)

Next, we add the latitude and longitude of each county to the DataFrame.

In [33]:
import geopy
from geopy.extra.rate_limiter import RateLimiter

locator = geopy.Nominatim(user_agent="myGeocoder")
# Nominatim's usage policy allows at most one request per second.
geocode = RateLimiter(locator.geocode, min_delay_seconds=1)

df_num_cases["point"] = (df_num_cases["county"] + ", TX").apply(geocode)

df_num_cases[['latitude', 'longitude']] = pd.DataFrame([(p.latitude, p.longitude) for p in df_num_cases["point"].tolist()])
df_num_cases = df_num_cases.drop(["point"], axis=1)

display(df_num_cases)

|   | county | date | num_cases | latitude | longitude |
|---|---|---|---|---|---|
| 0 | Bell | 3/20/2020 | 2 | 31.008166 | -97.431441 |
| 1 | Bexar | 3/20/2020 | 12 | 29.426399 | -98.510478 |
| 2 | Bowie | 3/20/2020 | 1 | 33.419889 | -94.447963 |
| 3 | Brazoria | 3/20/2020 | 3 | 29.18161 | -95.499337 |
| 4 | Brazos | 3/20/2020 | 2 | 30.652157 | -96.381114 |
| 5 | Cameron | 3/20/2020 | 1 | 26.129119 | -97.413428 |
| 6 | Collin | 3/20/2020 | 12 | 33.160963 | -96.606098 |
| 7 | Crane | 3/20/2020 | 1 | 31.448612 | -102.516354 |
| 8 | Dallas | 3/20/2020 | 22 | 32.776272 | -96.796856 |
| 9 | Denton | 3/20/2020 | 6 | 33.183879 | -97.141342 |
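Because the same counties reappear every day, the geocoding results could be cached so that each county is looked up only once. The following is a sketch under that assumption: `cached_geocode` and the cache file `county_coords.json` are hypothetical names, and `lookup` stands for any callable that returns an object with `latitude` and `longitude` attributes (such as the rate-limited `geocode` above) or `None` when nothing is found.

```python
import json
from pathlib import Path


def cached_geocode(counties, lookup, cache_path="county_coords.json"):
    """Return {county: (latitude, longitude)}, calling lookup only for new counties."""
    cache_file = Path(cache_path)
    # Start from previously saved coordinates, if any.
    cache = json.loads(cache_file.read_text()) if cache_file.exists() else {}

    for county in counties:
        if county in cache:
            continue  # already resolved on an earlier run
        location = lookup(f"{county}, TX")
        if location is None:
            continue  # leave unresolved counties out rather than guessing
        cache[county] = [location.latitude, location.longitude]

    # Persist the updated cache for the next run.
    cache_file.write_text(json.dumps(cache, indent=2))
    return {county: tuple(coords) for county, coords in cache.items()}
```

For example, `coords = cached_geocode(df_num_cases["county"], geocode)` would return the coordinates keyed by county name, skipping the network for counties already in the file.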


This map shows the locations where the virus has been found; each circle's radius corresponds to the county's number of cases as of 3/20/2020.

In [42]:
import folium

county_map = folium.Map(
    location=[31.9686, -99.9018],
    tiles='cartodbpositron',
    zoom_start=6,
)

# Draw one circle per county; the radius (in pixels) equals the case count.
for _, row in df_num_cases.iterrows():
    folium.CircleMarker(
        location=[row["latitude"], row["longitude"]],
        radius=float(row["num_cases"]),
    ).add_to(county_map)

county_map
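As counts grow, a radius equal to the raw case count would soon cover the map. One possible alternative, not what the cell above does, is to scale the radius with the square root of the count so that circle area grows roughly in proportion to cases. `scaled_radius` and its default parameters are hypothetical choices, shown only as a sketch.

```python
import math


def scaled_radius(count, scale=3.0, min_radius=2.0):
    """Pixel radius proportional to sqrt(count), with a floor so small counts stay visible."""
    return max(min_radius, scale * math.sqrt(max(count, 0)))
```

It could be passed as `radius=scaled_radius(int(row["num_cases"]))` when creating each `CircleMarker`.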