Why did the function scan_pb_poisson give different results than SatScan? #11

Weiren-Wang · 2021-04-25T07:21:00Z

Hi Benjamin,
I used the scan_pb_poisson to conduct space-time analysis with my dataset and I found that the results given by scan_pb_poisson and the result given by the software SatScan were quite different.
My dataset is a day-frequency disease counts data, range from 2020/12/31 to 2021/4/14.It contains 10 locations with latitude and longitude.
For SatScan,here is the settings:
[Input]
Time precision : Day
Coordinates : Lat/Long
[Analysis]
Type of Analysis : Space-Time
Probability Model : Poissson
Scan For Area With : High rates
And for scan_pb_possion,here is my code :
`counts = SZ_counts %>%
df_to_matrix(time_col = "time", location_col = "region", value_col = "count")
population = SZ_counts %>%
df_to_matrix(time_col = "time", location_col = "region", value_col = "population")
zones = SZ_geo %>%
select(long, lat) %>%
as.matrix %>%
spDists(x = ., y = ., longlat = TRUE) %>%
dist_to_knn(k = 4) %>%
knn_zones
regions = as.character(SZ_geo$region)
result = data.frame()
newcounts = counts
newpopulation = population
poisson_result = scan_pb_poisson(counts = newcounts,
zones = zones,
population = newpopulation,
n_mcsim = 999)
topclusters = top_clusters(poisson_result, zones, k = 10, overlapping = FALSE)

                      top_regions = topclusters$zone %>%
                        purrr::map(get_zone, zones = zones) %>%
                        purrr::map(function(x) regions[x])
                      
                      new_top_regions = c()
                      for (j in 1:length(top_regions)) {
                        new_top_regions[j] = paste(top_regions[[j]], collapse = ',')
                      }
                      
                      topclusters$zonename = new_top_regions
                      topclusters$endtime = rownames(population)[53]
                      result = rbind(result, topclusters)`

    For the same dataset, the SaTScan gave following results:
              1.Location IDs included.: 5
                Coordinates / radius..: (22.726017 N, 114.254455 E) / 0 km
                Time frame............: 2021/2/21 to 2021/4/13
                Population............: 2508600
                Number of cases.......: 198
                Expected cases........: 43.90
                Annual cases / 100000.: 55.4
                Observed / expected...: 4.51
                Relative risk.........: 6.85
                Log likelihood ratio..: 174.120739
                P-value...............: < 0.00000000000000001
              
              2.Location IDs included.: 4, 6, 1
                Coordinates / radius..: (22.754466 N, 113.942560 E) / 22.26 km
                Time frame............: 2021/2/10 to 2021/4/2
                Population............: 6369300
                Number of cases.......: 22
                Expected cases........: 111.46
                Annual cases / 100000.: 2.4
                Observed / expected...: 0.20
                Relative risk.........: 0.16
                Log likelihood ratio..: 63.471627
                P-value...............: < 0.00000000000000001
              
              3.Location IDs included.: 3, 7, 8
                Coordinates / radius..: (22.528466 N, 114.061547 E) / 12.88 km
                Time frame............: 2021/1/1 to 2021/2/20
                Population............: 4265300
                Number of cases.......: 14
                Expected cases........: 73.21
                Annual cases / 100000.: 2.4
                Observed / expected...: 0.19
                Relative risk.........: 0.17
                Log likelihood ratio..: 40.022434
                P-value...............: 0.000000000000092

       While scan_pb_possion gave following results:
     zone duration           score        relrisk_in  relrisk_out          Gumbel_pvalue      zonename          endtime
       15     104         392.4982441   4.248194    0.2996086         0.0000000             5              2021/2/28
        13     104        329.5112571     3.428604   0.2993484         0.0000000             4,5       2021/2/28
     
      The 2 zones given by scan_pb_possion were totally different from the 3 clusters given by SaTScan.Why is that?

      In addition, the SatScan only gave one relative risk but scan_pb_possion give two risk:relrisk_in,relrisk_out.How could I match these results?

The text was updated successfully, but these errors were encountered:

mairamorenoc · 2024-05-15T22:28:50Z

Hi,

In SatScan, are you using the prospective space-time model? Because as far as I understand, the scan_pb_possion function implemented in this package is only for prospective analysis, as he referenced Kulldorff 2001.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Why did the function scan_pb_poisson give different results than SatScan? #11

Why did the function scan_pb_poisson give different results than SatScan? #11

Weiren-Wang commented Apr 25, 2021 •

edited

mairamorenoc commented May 15, 2024

Why did the function scan_pb_poisson give different results than SatScan? #11

Why did the function scan_pb_poisson give different results than SatScan? #11

Comments

Weiren-Wang commented Apr 25, 2021 • edited

mairamorenoc commented May 15, 2024

Weiren-Wang commented Apr 25, 2021 •

edited