# Chapter 2: Digital Contact Tracing - Source Detection Simulator

## Objective

The simulator has two primary objectives:
1) Validate the theoretical detection probability of epidemic centrality on regular trees. 
2) Compare detection performance of various heuristics on assorted graph structures.

The simulator is available at: [https://dctracing.shinyapps.io/DCTracing/](https://dctracing.shinyapps.io/DCTracing/).


## Methodology

All simulations are based on the following two procedures:

*Infection Spreading*: This process models the spread of an infectious disease. Given an underlying graph, an initial infected individual (node), and a specified infection graph size, it produces an infection graph. The process starts by selecting a node of the underlying graph as the infection's origin and maintains an 'infection frontier' throughout. In each iteration, one node in the frontier is chosen at random to become infected. All copies of that node are then removed from the frontier, and all of its not-yet-infected neighbors are added to it. The process continues until the infection graph reaches the prescribed size.

Input: underlying graph - $G$, epidemic source - $v^*$, infection graph size - $N$<br>
Output: infection graph - $G_N$
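
The spreading procedure can be sketched in a few lines of Python. This is a minimal illustration and not the simulator's actual code: the adjacency-dict representation, the function name `spread_infection`, and the choice to store the frontier as (infector, candidate) pairs are all assumptions made for the example. Storing pairs means a susceptible node appears once per infected neighbor, which is one way to read the "copies" removed in each iteration.

```python
import random


def spread_infection(graph, source, size, rng=None):
    """Grow an infection graph with `size` nodes, starting from `source`.

    `graph` is an adjacency dict {node: list of neighbors} (an assumed
    representation). Returns the infected nodes in infection order and
    the list of (infector, infected) transmission edges.
    """
    rng = rng or random.Random()
    infected = [source]
    infected_set = {source}
    edges = []
    # Frontier entries are (infector, candidate) pairs, so a candidate
    # appears once for every infected neighbor it has.
    frontier = [(source, nb) for nb in graph[source] if nb != source]

    while len(infected) < size and frontier:
        parent, node = rng.choice(frontier)
        infected.append(node)
        infected_set.add(node)
        edges.append((parent, node))
        # Remove every copy of the newly infected node from the frontier.
        frontier = [(p, c) for (p, c) in frontier if c != node]
        # Add its not-yet-infected neighbors.
        frontier.extend(
            (node, nb) for nb in graph[node] if nb not in infected_set
        )
    return infected, edges


if __name__ == "__main__":
    # Hypothetical 5-node underlying graph containing one cycle.
    g = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
    nodes, tree_edges = spread_infection(g, source=0, size=4,
                                         rng=random.Random(1))
    print(nodes, tree_edges)
```

Because entries are drawn uniformly from the list of pairs, a susceptible node with several infected neighbors is more likely to be picked next. Whether the simulator weights nodes this way is not stated above, so treat it as a modeling assumption of this sketch.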

*Source Inferring*: Given an infection graph, the goal of source inferring is to deduce the initial infection source. Every node in the infection graph is assigned a score, and the node with the highest score is inferred to be the source of the infection. Ties are broken by random selection. Each estimator or heuristic computes the scores in its own way.

Input: infection graph - $G_N$, underlying graph - $G$<br>
Output: estimated infection source - $\hat{v}$
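
The score-then-argmax structure described above can be sketched as follows. The scoring rule used here, the negative sum of hop distances to all other infected nodes (a distance-centrality heuristic), is only an illustrative stand-in, and the names `infer_source` and `distance_score` are hypothetical. For simplicity the sketch uses only the infection graph $G_N$, given as an adjacency dict restricted to infected nodes, and ignores the underlying graph $G$.

```python
import random
from collections import deque


def bfs_distances(adj, root):
    """Hop distances from `root` to every reachable node."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def distance_score(adj, v):
    """Illustrative score: higher means closer to all other nodes."""
    return -sum(bfs_distances(adj, v).values())


def infer_source(adj, score_fn, rng=None):
    """Return a node with the highest score, breaking ties at random."""
    rng = rng or random.Random()
    scores = {v: score_fn(adj, v) for v in adj}
    best = max(scores.values())
    candidates = [v for v, s in scores.items() if s == best]
    return rng.choice(candidates)


if __name__ == "__main__":
    # Hypothetical infection graph: a path 0 - 1 - 2 - 3 - 4.
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
    print(infer_source(path, distance_score))
```

Any other estimator can be plugged in by passing a different `score_fn` with the same `(adj, v)` signature.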

## Simulations

:::{iframe} https://dctracing.shinyapps.io/DCTracing/
:width: 100%
:height: 100%
:::

## Challenges

1. Due to limited computational resources, we can only compute the epidemic centrality for infection graphs with at most 100 nodes.
2. To handle general graphs that contain cycles, building a BFS tree from every node in the infection graph requires $O(N^3)$ time, where $N$ is the infection graph size. This can be problematic for large-scale infection graphs.
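
To make the cost in item 2 concrete, the sketch below scores one candidate node by building a BFS tree rooted at it and evaluating rumor centrality on that tree, using the tree formula $N! / \prod_u T_u$, where $T_u$ is the size of the subtree rooted at $u$. It is a simplified illustration under stated assumptions: the function names are hypothetical, the graph is an adjacency dict, and the simulator's exact scoring may differ. One BFS costs $O(N + E)$, which is $O(N^2)$ on a dense graph, so repeating it for all $N$ candidate roots gives the $O(N^3)$ bound.

```python
from collections import deque
from math import factorial


def bfs_tree(adj, root):
    """Return the BFS visiting order and the parent map of the BFS tree."""
    parent = {root: None}
    order = [root]
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in parent:
                parent[w] = u
                order.append(w)
                queue.append(w)
    return order, parent


def bfs_rumor_centrality(adj, root):
    """Rumor centrality of `root` on the BFS tree rooted at `root`."""
    order, parent = bfs_tree(adj, root)
    size = {v: 1 for v in order}
    # Children come after their parents in BFS order, so the reversed
    # order accumulates subtree sizes bottom-up.
    for v in reversed(order):
        if parent[v] is not None:
            size[parent[v]] += size[v]
    product = 1
    for v in order:
        product *= size[v]
    return factorial(len(order)) // product


if __name__ == "__main__":
    # Hypothetical infection graph: a 4-cycle 0 - 1 - 2 - 3 - 0.
    cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
    print({v: bfs_rumor_centrality(cycle, v) for v in cycle})
```

The factorial is computed with exact integers here, which is fine for small graphs. For large $N$ one would typically compare scores in log space instead.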

:::{important}

*Algorithmic Complexity*: The rumor centrality algorithm for general graphs, as presented in [Shah, 2011](https://doi.org/10.1109/TIT.2011.2158885), has a time complexity of $O(N^3)$. This bound reflects the worst case, in which the graph has $\dfrac{(N-1)N}{2}$ edges, the maximum number for a simple undirected graph with $N$ nodes. However, the $O(N^3)$ time complexity does not mean that the algorithm exhaustively evaluates all possible configurations of the graph.

*Scalability in Numerical Simulations*: While the algorithm's time complexity is polynomial, a significant challenge arises in numerical simulations due to the inherent combinatorial nature of the problem. Specifically, enumerating all permitted permutations can involve up to $N!$ possible configurations. Because $N!$ grows extremely fast with $N$, such simulations quickly become computationally infeasible.

This distinction clarifies that while the algorithm operates with polynomial time complexity, the inherent problem size, especially in the context of numerical simulations, can lead to scalability challenges.

:::
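
The gap between the two costs can be seen on a toy example. The sketch below counts permitted permutations by brute force, that is, infection orderings in which every newly infected node is adjacent to an already infected one. The function name and the adjacency-dict input are assumptions for illustration. On a star graph with the source at the center, every ordering of the leaves is permitted, so the count is $(N-1)!$ and the enumeration grows factorially even though the graph is tiny.

```python
from math import factorial


def count_permitted_permutations(adj, source):
    """Count infection orderings of all nodes that start at `source`.

    Each step may infect any susceptible node adjacent to an already
    infected one. Brute force: only usable for very small graphs.
    """
    n = len(adj)

    def extend(infected):
        if len(infected) == n:
            return 1
        frontier = {w for u in infected for w in adj[u] if w not in infected}
        return sum(extend(infected | {w}) for w in frontier)

    return extend(frozenset([source]))


if __name__ == "__main__":
    for leaves in range(1, 8):
        # Star graph: node 0 is the center, nodes 1..leaves are leaves.
        star = {0: list(range(1, leaves + 1))}
        star.update({i: [0] for i in range(1, leaves + 1)})
        count = count_permitted_permutations(star, 0)
        assert count == factorial(leaves)
        print(leaves + 1, "nodes:", count, "orderings")
```

On trees, the closed-form rumor centrality gives the same count without enumeration, which is why the polynomial-time algorithm and the factorial-size configuration space can coexist.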

### References

1. Hang, C. N., Yu, P. D., Chen, S., Tan, C. W., & Chen, G. (2023). MEGA: Machine Learning-Enhanced Graph Analytics for Infodemic Risk Management. IEEE Journal of Biomedical and Health Informatics.
2. Hang, C. N., Tsai, Y. Z., Yu, P. D., Chen, J., & Tan, C. W. (2023). Privacy-Enhancing Digital Contact Tracing with Machine Learning for Pandemic Response: A Comprehensive Review. Big Data and Cognitive Computing, 7(2), 108.
3. Fei, Z., Ryeznik, Y., Sverdlov, O., Tan, C. W., & Wong, W. K. (2021). An overview of healthcare data analytics with applications to the COVID-19 pandemic. IEEE Transactions on Big Data, 8(6), 1463-1480.