# IOT for Pandemics

This notebook is part of the [*Practical Data Science for IOT*](https://github.com/pablodecm/datalab_ml_iot) tutorial by Pablo de Castro.

## What can we do? (in addition to staying at home)

Given the ongoing COVID-19 pandemic (this was initially written on the 25th of
March in Spain), it is worth thinking about possible technological solutions
that could help manage this crisis or future pandemics.

<br>

<div align="center">
  <img src="images/data_science_diagram.png" width="40%">
</div>


## Brainstorming Ideas

Let's think about different solutions that use IOT technologies and data science to help with the crisis, iterating over the following structure:



### WHY

The actual problem or challenge that we are trying to solve. <br>
Described in specific terms (not in general terms).  <br>
Could be sub-problems of a larger problem.

### HOW

Which technologies could be used to address the WHY. <br>
How will these technologies interact? Estimate the human and economic costs.  <br>
Sketch the system components and how they play together.

### WHAT

Name or describe the solution. <br>
Could it really address the WHY? <br>
If yes, great! **Get feedback and/or try to build a PoC!** <br>
If not, do not worry, keep iterating!

## Be bold!

**Homework**: individually or in groups, think about Data Science and IOT solutions that could help with the COVID-19 crisis, following the structure above.



Send your ideas using this Google Form and we will discuss them tomorrow in class:

https://forms.gle/tZJh8hE3vLdwyyQW8

## Extra for tomorrow

Go to https://takeout.google.com/settings/takeout, download your own Location History data and keep it safe. In tomorrow's exercise we will use our own location data. If possible, try to get the location data of someone else (with their permission) to study contact tracing between people.

<div align="center">
  <img src="images/google_takeout.png" width="50%">
</div>

## Example Idea


### WHY

The SARS-CoV-2 virus is very contagious due to the combined effect of a long incubation period, a large fraction of infected people developing only mild symptoms while still being contagious, and a high survivability on surfaces and in air droplets.

Uncontrolled transmission in the population can cause rapid growth in cases, with high mortality within risk groups, and can easily overwhelm health systems. Strong confinement and social distancing seem to be the only effective way to stop the rapid spread once it is already out of control.

While country-wide confinement is required in the short term to stop the current transmission waves, it might not be sustainable in the long term from a social and economic perspective (vaccine production at scale is probably years away, and treatments are likely to help but not to be a definitive solution).

**Without confinement, even with massive testing, it is really hard to track the virus transmission chain at scale, i.e. to find out who is likely to have been infected by someone who has just tested positive. Better ways to trace transmissions within the population
would allow more targeted testing campaigns and containment through small-group confinement.**






### HOW

Modern technology is likely to help with the problem of tracking the transmission chain. We need a way to register potential virus transmissions between people, so that the records can be used to trace the graph of possible infections once someone has tested positive.
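To make the idea of "tracing the graph of possible infections" concrete, here is a minimal sketch using a breadth-first search over registered contacts. The function name `trace_contacts`, the `max_hops` parameter and the people in the example are made up for illustration. The sketch also ignores when each contact happened, which a real system would have to take into account.

```python
from collections import deque


def trace_contacts(contacts, index_case, max_hops=2):
    """Return {person: hops} for everyone reachable from index_case within max_hops."""
    # Build an undirected adjacency map from the (person_a, person_b) pairs
    graph = {}
    for a, b in contacts:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    # Breadth-first search, stopping once max_hops is reached
    hops = {index_case: 0}
    queue = deque([index_case])
    while queue:
        person = queue.popleft()
        if hops[person] == max_hops:
            continue
        for neighbour in graph.get(person, ()):
            if neighbour not in hops:
                hops[neighbour] = hops[person] + 1
                queue.append(neighbour)

    del hops[index_case]  # the index case is not their own contact
    return hops


# Hypothetical registered contacts (names are invented)
contacts = [("ana", "ben"), ("ben", "carla"), ("carla", "dani"), ("eva", "fran")]
exposed = trace_contacts(contacts, "ben", max_hops=2)
print(exposed)
```

If "ben" tests positive, "ana" and "carla" are direct contacts, "dani" is a second-degree contact, and "eva" and "fran" are not reached at all, so they would not need to be tested or confined.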

This could be done with a powerful surveillance infrastructure, but most countries lack one, and building it at the required scale is neither feasible nor desirable. Alternatively, the solution could be built on people knowingly carrying personal devices, either voluntarily or as a requirement in public spaces until the crisis is under control.

A large fraction of the world population already owns an internet-capable, sensor-rich smartphone, and additional smartphone-like devices could be provided at scale to those who do not.

**A transmission tracing system at national or multi-national scale can be based on location (e.g. cell phone tower triangulation or A-GPS logs), on close distance between peers (e.g. ultrasound or Bluetooth), or on a combination of these technologies. Data already collected by companies could also be reused. A secure and scalable data collection and analysis infrastructure could be built rapidly in the cloud and managed by trusted parties.**
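As a rough illustration of the location-based variant, the sketch below flags pairs of location pings from two people that are close in both space and time. The helper names (`haversine_m`, `find_contacts`), the thresholds (10 m, 5 minutes) and the coordinates are assumptions chosen for the example. Real GPS or cell tower data is usually less accurate than these thresholds suggest.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    r = 6_371_000  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def find_contacts(trace_a, trace_b, max_dist_m=10.0, max_dt_s=300):
    """Return (time_a, time_b) pairs of pings that are close in space and time."""
    hits = []
    # Brute-force comparison of every ping pair; fine for a sketch, slow at scale
    for t_a, lat_a, lon_a in trace_a:
        for t_b, lat_b, lon_b in trace_b:
            close_in_time = abs(t_a - t_b) <= max_dt_s
            if close_in_time and haversine_m(lat_a, lon_a, lat_b, lon_b) <= max_dist_m:
                hits.append((t_a, t_b))
    return hits


# Made-up traces: (unix_time_s, latitude, longitude)
trace_a = [(0, 43.4623, -3.8100), (600, 43.4700, -3.8000)]
trace_b = [(60, 43.46231, -3.81001), (660, 43.5000, -3.9000)]
close_pings = find_contacts(trace_a, trace_b)
print(close_pings)
```

Each hit could then be registered as a potential transmission between the two people. A proximity-based system (e.g. Bluetooth) would produce such contact records directly, without needing absolute positions.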



### WHAT

A contact tracing system based on data collected by smartphones and other personal devices could be built in a short amount of time and could potentially address the problem of tracing the transmission chains, which in turn can be used to direct testing campaigns and enforce small-group confinement, keeping most societal and economic activity intact.

While promising, there are also unexplored concerns regarding data protection and privacy, the different capabilities of the technologies mentioned, whether usage would have to be enforced or voluntary usage would suffice, and how to bring together the organisational, human and economic resources needed to create an effective solution in a short time.

**The best way to resolve some of these uncertainties is to iteratively build proof of concept (PoC) examples of the system. In the rest of this document we are going to build a basic PoC of a tracing system based on Google Location History.**


### Google Location History PoC

For this part we are going to use your own Google Location History (and optionally that of someone else who has given you access to theirs).

In [None]:
!ls -lrth google_location_history_data/

In [None]:
import zipfile
from pathlib import Path

loc_data_dir = Path("google_location_history_data/")

In [None]:
zip_file_list = list(loc_data_dir.glob("*.zip"))

zip_file_list

In [None]:
# inspect the first archive; index 0 also works when only one zip is present
zf = zipfile.ZipFile(zip_file_list[0])
zf.namelist()

In [None]:
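import io

import pandas as pd

# Takeout names the export differently depending on the account language
good_file_names = ['Takeout/Historial de ubicaciones/Historial de ubicaciones.json',
                   'Takeout/Location History/Location History.json']

def load_dataframe_from_file(file_path: Path) -> pd.DataFrame:
    """Load the Location History JSON from a Takeout zip into a DataFrame."""
    # the context manager closes the archive once the file has been read
    with zipfile.ZipFile(file_path) as zip_file:
        for file_name in zip_file.namelist():
            if file_name in good_file_names:
                return pd.read_json(io.BytesIO(zip_file.read(file_name)))
    # fail loudly instead of silently returning None
    raise FileNotFoundError(f"No Location History JSON found in {file_path}")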


In [None]:
df = load_dataframe_from_file(zip_file_list[0])

In [None]:
# try to continue this analysis yourself
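
One possible next step is to flatten the nested records into one row per location ping. The sketch below assumes the export has a top-level `locations` list whose entries carry `timestampMs`, `latitudeE7`, `longitudeE7` and `accuracy` fields. These field names are an assumption about the Takeout format and may differ in your export, so inspect `df.head()` first. The function name `flatten_locations` and the sample values are made up.

```python
import pandas as pd


def flatten_locations(raw_df: pd.DataFrame) -> pd.DataFrame:
    """Turn the nested `locations` column into one row per location record."""
    flat = pd.json_normalize(raw_df["locations"].tolist())
    # Assumption: coordinates are stored as integers scaled by 1e7
    flat["latitude"] = flat["latitudeE7"] / 1e7
    flat["longitude"] = flat["longitudeE7"] / 1e7
    # Assumption: timestamps are milliseconds since the Unix epoch, stored as strings
    flat["time"] = pd.to_datetime(flat["timestampMs"].astype("int64"), unit="ms", utc=True)
    columns = ["time", "latitude", "longitude", "accuracy"]
    return flat[columns].sort_values("time").reset_index(drop=True)


# Tiny made-up sample with the same assumed layout as the Takeout export
sample = pd.DataFrame({"locations": [
    {"timestampMs": "1585130460000", "latitudeE7": 434623000,
     "longitudeE7": -38100000, "accuracy": 20},
    {"timestampMs": "1585130400000", "latitudeE7": 434624000,
     "longitudeE7": -38101000, "accuracy": 15},
]})
locations = flatten_locations(sample)
print(locations)
```

On your real data this would be called as `flatten_locations(df)`, using the `df` loaded above. The resulting table of times and coordinates is a convenient starting point for comparing two people's traces.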