# Some tabulations and visualizations relevant to cancer epidemiology

## Tables for immediate use

### Cancer statistics by "metropolitan statistical area"

The BiocYES package includes `woncan`, a table derived from the CDC WONDER data service. The table summarizes cancer incidence over 1999-2018.

In [1]:
#remove.packages("AnVIL")
#BiocManager::install("AnVIL", update=FALSE, ask=FALSE)
ii = rownames(installed.packages())       # names of packages already installed
needed = c("leaflet", "DT", "dplyr")
toinst = setdiff(needed, ii)              # needed packages that are missing
if (length(toinst)>0) AnVIL::install(toinst, update=FALSE, ask=FALSE)

In [2]:
if (!("BiocYES" %in% ii)) BiocManager::install("vjcitn/BiocYES", ask=FALSE, update=FALSE)

In [3]:
suppressWarnings({suppressMessages({suppressPackageStartupMessages({
    library(BiocYES)
    library(DT)
    library(dplyr)
    })})
data(woncan)
datatable(woncan |> filter(MSA != "Other") |> select(-Population))
})

“It seems your data is too big for client-side DataTables. You may consider server-side processing: https://rstudio.github.io/DT/server.html”
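
The interactive search box is convenient, but the same kind of question can also be answered in code. The following is a minimal sketch on a toy table, not on `woncan` itself: the column names `site` and `rate` are hypothetical (only `MSA` and `Population` are confirmed by the code above), and the numbers are invented for illustration.

```r
# Toy stand-in for a table like woncan.
# 'site' and 'rate' are hypothetical column names; values are made up.
toy = data.frame(
  MSA  = c("MSA A", "MSA B", "MSA C", "Other"),
  site = c("Pancreas", "Pancreas", "Pancreas", "Pancreas"),
  rate = c(12.5, 11.0, 13.2, 12.0)
)

# Keep named MSAs for one cancer site, then pick the row with the smallest rate
named  = subset(toy, MSA != "Other" & site == "Pancreas")
lowest = named[which.min(named$rate), ]
lowest$MSA
```

To use this approach on the real table, inspect `names(woncan)` first and substitute the actual column names.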


### Exercises

- What is the age-adjusted rate of pancreatic cancer per 100,000 residents in the Boston metropolitan statistical area (MSA)?

- Which MSA has the lowest recorded rate of female breast cancer (1999-2018)? (Search for code 26000-Female.)

### Massachusetts data

The function `MA_cancer_rate_table()` produces tables of breast or prostate cancer incidence in Massachusetts counties.


In [4]:
datatable(MA_cancer_rate_table("breast"))

### Exercises

- Are the reported values for breast cancer incidence 2014-2018 in Massachusetts counties compatible with the WONDER 1999-2018 report for the Boston MSA?

- Compare the county-level rates for prostate cancer in MA with the WONDER rate reported for the Boston MSA.

## Interactive maps

- Twenty years ago, the most common Geographic Information System was the paper map or road atlas.
- Now our cell phones can ask the internet how to get where we want to go, efficiently.
- Understanding how cancer events unfold in different geographic regions is important for public health.
    - Are there important environmental hazards at specific locations?
    - Are there clues to genetic origins of particular cancers?
    - Are culturally shared behaviors leading to increased risk?
- Even though we are comfortable with annotated maps, creating and using "cancer maps" to reason about cancer risk requires some training.
- In this notebook we will work with some interactive maps on the web, and we will produce some maps using R programming.

## Exercise 1

Use the [International Agency for Research on Cancer (IARC) map tool](https://gco.iarc.fr/today/online-analysis-map?v=2020&mode=population&mode_population=continents&population=900&populations=900&key=asr&sex=0&cancer=39&type=0&statistic=5&prevalence=0&population_group=0&ages_group%5B%5D=0&ages_group%5B%5D=17&nb_items=10&group_cancer=1&include_nmsc=0&include_nmsc_other=0&projection=natural-earth&color_palette=default&map_scale=quantile&map_nb_colors=5&continent=0&show_ranking=0&rotate=%255B10%252C0%255D)
to survey mortality from cancer in 2020 for individuals aged 10-24.  You should see something like the display below.
    

![IARC map tool display](https://storage.googleapis.com/bioc-anvil-images/IARCoverall.jpg)

True or False: Age-standardized mortality from cancer in 2020 for persons aged 10-24 is greater in Vietnam than in neighboring countries.

## Exercise 2

Use the IARC map tool to produce a worldwide map of breast cancer incidence for women aged 60-79.

Which Scandinavian country has the largest estimate of age-standardized breast cancer incidence for women aged 60-79?

## Creating a map

We've provided some software that helps you make interactive maps in Jupyter.

In [5]:
library(BiocYES)

Once you have run the `library()` command above, you can use `mass_map()` to produce an interactive map:

In [6]:
mass_map()

The map starts out with a focus on the Boston area.  You can point and click to move the focus of the map, or use the +/- control at the top to zoom in or out.
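
For readers curious about what sits behind a map like this, here is a minimal sketch that builds a map centered near Boston with the leaflet package directly. This is not the source of `mass_map()`, whose internals are not shown here; the coordinates are approximate and the zoom level is an arbitrary choice for illustration.

```r
# Approximate coordinates for central Boston (assumed starting view)
boston = c(lng = -71.06, lat = 42.36)

# Build the map only if leaflet is installed
if (requireNamespace("leaflet", quietly = TRUE)) {
  m = leaflet::leaflet() |>
    leaflet::addTiles() |>                      # default OpenStreetMap tiles
    leaflet::setView(lng = boston[["lng"]],
                     lat = boston[["lat"]],
                     zoom = 9)                  # zoom level is a guess
}
```

Typing `m` in a notebook cell displays the map, with the same pan and zoom controls described above.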

## Adding cancer statistics to the map

We have tables of age-adjusted rates of breast and prostate cancer in Massachusetts counties. Here we retrieve the breast cancer table.

In [9]:
brtab = MA_cancer_rate_table(site="breast")
head(brtab, 2)

| | County | Cancer.Type | Year | Age.Adjusted.Rate | lci | uci | Case.Count | Population |
|---|---|---|---|---|---|---|---|---|
| | `<chr>` | `<chr>` | `<chr>` | `<dbl>` | `<dbl>` | `<dbl>` | `<int>` | `<int>` |
| 1 | Franklin County | Female Breast | 2014-2018 | 112.1 | 99.3 | 126.4 | 317 | 181356 |
| 2 | Suffolk County | Female Breast | 2014-2018 | 115.5 | 110.8 | 120.4 | 2327 | 2041877 |
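
The `lci` and `uci` columns bracket each rate. Assuming they are lower and upper confidence limits (the table itself does not say so), a short sketch shows how to compare the two counties displayed above. The values are copied from those two rows; the variable names are chosen here for illustration.

```r
# Values copied from the two rows of brtab shown above
ci = data.frame(
  County = c("Franklin County", "Suffolk County"),
  rate   = c(112.1, 115.5),
  lci    = c(99.3, 110.8),
  uci    = c(126.4, 120.4)
)

# Interval width: wider intervals indicate less precise estimates
ci$width = ci$uci - ci$lci

# Two intervals overlap when each lower limit is below the other's upper limit
overlap = ci$lci[1] <= ci$uci[2] && ci$lci[2] <= ci$uci[1]
overlap
```

Franklin County, with far fewer cases, has the wider interval. Overlapping intervals suggest the two rates are compatible with one another, although overlap alone is not a formal test.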


Here is code that produces a table that combines county latitude and longitude measures with the age-adjusted breast cancer rates.

In [11]:
data(us_county_geo)
lj = left_join(mutate(brtab, county=County),   # mutate: obtain new variable name
               filter(us_county_geo, state=="MA"), by="county") # merge rates and geography
lj$lng = sapply(lj$geometry, "[", 1) # "geometry" is a special structure
lj$lat = sapply(lj$geometry, "[", 2) # need to peel apart latitude and longitude
lj$aarat = lj$Age.Adjusted.Rate   # shorter name
head(lj,2)

| | County | Cancer.Type | Year | Age.Adjusted.Rate | lci | uci | Case.Count | Population | county | state | fips | ansicode | area_land | area_water | area_land_sqmi | area_water_sqmi | geometry | lng | lat | aarat |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | `<chr>` | `<chr>` | `<chr>` | `<dbl>` | `<dbl>` | `<dbl>` | `<int>` | `<int>` | `<chr>` | `<chr>` | `<chr>` | `<chr>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<POINT>` | `<dbl>` | `<dbl>` | `<dbl>` |
| 1 | Franklin County | Female Breast | 2014-2018 | 112.1 | 99.3 | 126.4 | 317 | 181356 | Franklin County | MA | 25011 | 606932 | 1810916209 | 65604206 | 699.199 | 25.33 | POINT (-72.59179 42.5845) | -72.59179 | 42.5845 | 112.1 |
| 2 | Suffolk County | Female Breast | 2014-2018 | 115.5 | 110.8 | 120.4 | 2327 | 2041877 | Suffolk County | MA | 25025 | 606939 | 150863059 | 160514693 | 58.249 | 61.975 | POINT (-71.01825 42.33855) | -71.01825 | 42.33855 | 115.5 |
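
The `sapply(..., "[", 1)` idiom used above may look cryptic. In R, `"["` is itself a function, so `sapply(x, "[", 1)` extracts the first element of every item in `x`. The sketch below imitates the `geometry` column with a plain list of longitude/latitude pairs, which is an assumption about how that column behaves; the coordinates are the two shown in the table.

```r
# Stand-in for the geometry column: one (longitude, latitude) pair per county
geom = list(c(-72.59179, 42.5845),
            c(-71.01825, 42.33855))

lng = sapply(geom, "[", 1)   # first element of each pair
lat = sapply(geom, "[", 2)   # second element of each pair

# An equivalent, more explicit spelling of the longitude extraction
lng2 = vapply(geom, function(p) p[1], numeric(1))
```

Note that each point stores longitude first and latitude second, which is why index 1 gives `lng` and index 2 gives `lat`.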


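Now we use the leaflet function `addAwesomeMarkers` with our latitude and longitude to add one marker per county to the map. Clicking a marker opens a popup showing the cancer type, county, and age-adjusted rate.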

In [10]:
mass_map() |>
    leaflet::addAwesomeMarkers(lat=lj$lat, lng=lj$lng,
                               # popup text uses HTML <br> tags for line breaks
                               popup=paste(lj$Cancer.Type[1], "<br>", lj$county, "<br>", lj$aarat, sep=""))
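
To see what the popup text looks like before it reaches the map, the `paste()` call can be run on its own. This sketch uses the two counties shown earlier in place of the full `lj` table; the variable names are chosen here for illustration.

```r
# Two counties and rates copied from the table output above
county = c("Franklin County", "Suffolk County")
aarat  = c(112.1, 115.5)

# paste() recycles the single cancer type across both counties;
# "<br>" is the HTML tag for a line break
popup = paste("Female Breast", "<br>", county, "<br>", aarat, sep="")
popup
```

Each element of `popup` is one HTML string, so the first marker would show three lines: the cancer type, the county name, and the rate.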

## Exercise 3

- Our presentation of the table `lj` above is not interactive.  How can you make it interactive?

- How does the map help you to think about patterns of breast cancer incidence in the counties of Massachusetts?

- Obtain the table for age-adjusted rates for prostate cancer in MA in 2014-2018, and modify the map to present those statistics.  Do you have any comments about patterns of prostate cancer incidence in Massachusetts counties?

