# IBM Data Science Professional Certificate - Capstone project
For the Capstone Project of the [IBM Data Science Professional Certificate](https://www.coursera.org/professional-certificates/ibm-data-science), we were asked to define a problem or idea of our choice that could be addressed by leveraging Foursquare location data.

## 1. Business Problem

### 1.1 COVID in Brazil
The COVID-19 pandemic, also known as the coronavirus pandemic, is an ongoing pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). 
When this notebook was written, more than 180 thousand people had lost their lives to COVID-19 in Brazil. As of December 15th, 2020, Brazil had not detailed its coronavirus immunization plan and [did little to assuage concerns that the government is stumbling in its efforts to inoculate 212 million people, with no timeline and vague supply agreements.](https://www.bloomberg.com/news/articles/2020-12-14/brazil-s-vaccination-plan-elicits-more-questions-than-answers)

Brazil occupies half of South America's landmass. [It is the fifth largest country in the world, with an area greater than that of the 48 conterminous U.S. states.](https://www.britannica.com/place/Brazil) Brazil stretches roughly 2,700 miles (4,350 km) from north to south and from east to west, forming a vast irregular triangle that encompasses a wide range of tropical and subtropical landscapes, including wetlands, savannas, plateaus, and low mountains. In a country of this continental scale, healthcare coverage is very unevenly distributed.

Hospitals and beds are concentrated in the largest cities and most developed regions, leaving few in the poorer regions. [In fact, patients from 43% of the country's cities will need to travel if they have serious COVID-19 symptoms](https://oglobo.globo.com/sociedade/coronavirus-menos-de-um-quarto-dos-municipios-brasileiros-tem-leitos-de-uti-que-atendem-pelo-sus-24353614). This is because only 53.1% of the 5,571 Brazilian municipalities have hospitals with inpatient beds available through the public system. In general, these are larger cities, designated as regional health centers to meet the demand of all patients in their area.

### 1.2 Healthcare workers
Frontline health and social care workers [are at increased personal risk of exposure to infection with COVID-19](https://www.thelancet.com/journals/lanpub/article/PIIS2468-2667(20)30164-X/fulltext) and of transmitting that infection to susceptible and vulnerable patients in health and social care settings.

In December 2020, the United Kingdom became the first country in the world to roll out the Pfizer-BioNTech COVID-19 vaccine for emergency use. The UK committee [considered frontline health and social care workers who provide care to vulnerable people a high priority for vaccination](https://www.gov.uk/government/publications/priority-groups-for-coronavirus-covid-19-vaccination-advice-from-the-jcvi-2-december-2020/priority-groups-for-coronavirus-covid-19-vaccination-advice-from-the-jcvi-2-december-2020). This prioritisation was taken into account during vaccine deployment.

In Brazil, unlike hospital beds, the healthcare workforce is more evenly distributed across all states and thousands of municipalities. These professionals work mainly in healthcare centres such as clinics and doctors' offices, and not all of them are located in cities with available hospitals.

### 1.3 Notebook goal
This notebook aims to use *Foursquare location data* combined with the *number of healthcare workers by municipality* and *GPS data* to suggest **the closest cities with hospitals that could be used as vaccination centers for those professionals**. Because healthcare workers are one of the first groups to receive the vaccine, being vaccinated at a hospital also means that treatment is at hand in case of an anaphylactoid reaction or other complications. This notebook will also display a dataframe with the number of people who would be immunized in each city, to help with transportation and logistics.

### 1.4 Disclaimer
This notebook was created as part of a Data Analysis course. It is publicly available, as required, but for academic purposes only and must not be used to make actual decisions. The COVID-19 pandemic is a serious matter and lives are at stake. In particular, the Foursquare data may not be accurate (although its use was mandatory for this activity), and many simplifications were made in this **learning exercise**, including but not limited to the fact that we are mapping hospitals instead of using the more than [36,000 vaccination rooms](https://www.scielo.br/pdf/ress/v28n2/en_2237-9622-ress-28-02-e20190223.pdf) in Brazil. For actual vaccine deployment plans, please refer to the Brazilian [Ministry of Health](https://www.gov.br/saude/pt-br).

## 2. Data
1. [Foursquare API](https://developer.foursquare.com/developer/) - Used to find the hospitals closest to each city center.
1. [The National Registry of Health Facilities (CNES)](http://www2.datasus.gov.br/DATASUS/index.php?area=0204&id=11673) (in Portuguese) - Includes the number of healthcare workers by municipality.
1. [Municípios Brasileiros](https://github.com/kelvins/Municipios-Brasileiros) (in Portuguese) - Latitude and longitude of Brazilian municipalities. Has 5,570 entries.
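
As a rough illustration of how the first data source might be queried, the sketch below builds a hospital search request around a city's coordinates. It assumes the Foursquare v2 `venues/search` endpoint with userless authentication (`client_id`, `client_secret`, `v`), which may have changed since this notebook was written. The function name, the search radius, the result limit, and the coordinates in the usage line are illustrative assumptions, not values taken from the notebook.

```python
from urllib.parse import urlencode

# Assumed endpoint: Foursquare v2 venue search (check the current API docs).
FOURSQUARE_SEARCH_URL = "https://api.foursquare.com/v2/venues/search"


def build_hospital_search_url(lat, lon, client_id, client_secret,
                              radius_m=10000, limit=50, version="20201215"):
    """Build (but do not send) a venue search URL for hospitals near a point.

    radius_m, limit and version are illustrative defaults, not values
    confirmed by the notebook.
    """
    params = {
        "client_id": client_id,          # credentials from the developer portal
        "client_secret": client_secret,
        "v": version,                    # API version date, YYYYMMDD
        "ll": f"{lat},{lon}",            # city center from the coordinates dataset
        "query": "hospital",             # free-text search term
        "radius": radius_m,              # search radius in meters
        "limit": limit,                  # maximum number of venues to return
    }
    return f"{FOURSQUARE_SEARCH_URL}?{urlencode(params)}"


# Approximate, illustrative coordinates; replace the placeholder credentials.
url = build_hospital_search_url(-9.91, -63.04, "YOUR_ID", "YOUR_SECRET")
```

The resulting URL could then be fetched with any HTTP client; a city would be treated as having a hospital if the response contains at least one venue.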

This notebook aims to use the above data to suggest the closest hospitals that could be used as vaccination centers for healthcare professionals. It will also display a dataframe with the number of people who would be immunized in each hospital, to help plan vaccine transportation and logistics.

Example:
 1. Using CNES (data source 2), we see that there are 1046 healthcare professionals in *Ariquemes* (a Brazilian municipality).
 1. Using Foursquare (data source 1) and MB (data source 3), we see that there is at least 1 hospital in the city. We set 1046 as the number of people who would be immunized there.
 1. Using CNES, we see that there are 30 healthcare professionals in *Rio Crespo*.
 1. Using Foursquare, we find no hospitals in the city.
 1. Using MB (data source 3), we get the latitude and longitude of *Rio Crespo*. Using this location, we find that the nearest city with a hospital is *Ariquemes*. We add 30 to the number of people who would be immunized there, increasing this number to 1076.
 1. We repeat this for the whole country and all available municipalities. In the end, we have a map and a dataframe with the necessary information.
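
The steps in the example can be sketched in plain Python as follows. This is a minimal illustration and not the notebook's actual implementation: the function names and the dictionary layout are assumptions, distances are great-circle (haversine) distances between city centers instead of travel distances, the coordinates are approximate, and the third city is entirely hypothetical, added only so that the nearest-city choice is not trivial.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def immunization_totals(cities):
    """Sum healthcare workers per hospital city, sending each city without
    a hospital to the nearest city that has one."""
    hubs = [c for c in cities if c["has_hospital"]]
    if not hubs:
        raise ValueError("at least one city with a hospital is required")
    totals = {hub["name"]: 0 for hub in hubs}
    for city in cities:
        # A city with a hospital is at distance 0 from itself, so it keeps
        # its own workers.
        nearest = min(
            hubs,
            key=lambda hub: haversine_km(city["lat"], city["lon"],
                                         hub["lat"], hub["lon"]),
        )
        totals[nearest["name"]] += city["workers"]
    return totals


cities = [
    # Worker counts come from the example above; coordinates are approximate.
    {"name": "Ariquemes", "lat": -9.91, "lon": -63.04,
     "workers": 1046, "has_hospital": True},
    {"name": "Rio Crespo", "lat": -9.70, "lon": -62.90,
     "workers": 30, "has_hospital": False},
    # Hypothetical distant city with made-up values, for illustration only.
    {"name": "Hypothetical Hub", "lat": -3.00, "lon": -60.00,
     "workers": 500, "has_hospital": True},
]

totals = immunization_totals(cities)
print(totals)  # {'Ariquemes': 1076, 'Hypothetical Hub': 500}
```

Comparing every city against every hospital city is quadratic in the number of municipalities, which is still manageable for a few thousand entries; a spatial index could be swapped in if it becomes too slow.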