# Overview of methods

Methods are listed in order of increasing complexity.  


Simple measures of difference:
*   Total difference
*   Kernel density map difference
*   Aggregated areas/zones map difference
*   Disaggregated areas/zones map difference
*   Simple global summary statistics derived from difference
*   Differences in flow matrices

Aggregate global summary statistics:


Area-by-area statistics:
*   Kappa statistics


Raster statistics:


Multi-scale raster statistics:
*   Multiscale correlation statistics


Spatio-temporal statistics:





# Issues to consider for all methods

NB: Here "regions" are the areas predicted for; "areas" are geographical subregions within a region; "national" is the aggregation of regions.

## Baseline comparison

Some methods compare models with each other (i.e. they can say whether one model does better than another).  
Some methods compare against a standard baseline (e.g. a business-as-usual model).  
Some methods measure against an absolute scale.  
Some methods give a statistical significance (i.e. judged against the probability of wrongly rejecting the null hypothesis).  

What baselines might we compare to?   
*   Business as usual: using previous data as a prediction of the future - if you can't do better than doing nothing...  
*   Randomised model: throw previous data at a map (either spatio-temporally randomly or weighted to some population at risk) and use this as a future prediction - if you can't do better than random allocation...   
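
As a rough illustration of the two baselines above, the sketch below scores a model and both baselines with the same error measure. The counts, the population weights and the choice of mean absolute error are all assumptions made for the example, not part of any particular method.

```python
import random


def mean_abs_error(predicted, observed):
    """Mean absolute difference between two equal-length lists of counts."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def business_as_usual(previous):
    """Baseline 1: last period's counts are reused as the prediction."""
    return list(previous)


def randomised(previous, weights, rng):
    """Baseline 2: scatter last period's total across areas at random,
    weighted by some population at risk."""
    counts = [0] * len(weights)
    areas = range(len(weights))
    for area in rng.choices(areas, weights=weights, k=sum(previous)):
        counts[area] += 1
    return counts


# Hypothetical counts per area (illustrative numbers only).
previous = [12, 30, 7, 51]
observed = [14, 27, 9, 50]
model = [13, 29, 8, 49]
population = [1000, 2500, 800, 4000]

rng = random.Random(0)  # fixed seed so the random baseline is repeatable
errors = {
    "model": mean_abs_error(model, observed),
    "business as usual": mean_abs_error(business_as_usual(previous), observed),
    "randomised": mean_abs_error(randomised(previous, population, rng), observed),
}
for name, err in errors.items():
    print(f"{name}: {err:.2f}")
```

A model is only interesting if its error is clearly below both baselines; in practice the randomised baseline would be repeated many times to give a distribution rather than a single figure.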
       
## What dimensions are we interested in validating against, and to what precision?

Space  
*   Points in space-time  
*   Density kernel maps  
*   Aggregate areas  
*   More arbitrary areas: for example, hotspots (overlap between real and predicted; intensity). If hotspots are street-based, what does it mean if we predict a nearby street? 
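
One possible way to put a number on hotspot overlap is the Jaccard index between the sets of real and predicted hotspot cells. In the sketch below, defining a hotspot as "one of the top k cells by count" and the cell names are assumptions made purely for illustration.

```python
def top_cells(counts, k):
    """Return the ids of the k cells with the highest counts as a set."""
    return set(sorted(counts, key=counts.get, reverse=True)[:k])


def jaccard(a, b):
    """Size of the intersection divided by size of the union of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical counts per cell (illustrative numbers only).
observed = {"A": 9, "B": 7, "C": 1, "D": 0, "E": 6}
predicted = {"A": 8, "B": 2, "C": 7, "D": 1, "E": 5}

overlap = jaccard(top_cells(observed, 3), top_cells(predicted, 3))
print(overlap)
```

Note that this measure treats a near miss as a complete miss: predicting the street next to a real hotspot scores no better than predicting one across the city, which is the street-based problem raised above.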

Time  


## Do we need to assess the practicality of predictions?

It is one thing to make a prediction; it is another to make a usable prediction. 



## Relative vs absolute predictions

Are we interested in using absolute predictions or relative predictions? 

Relative predictions include percentages of crimes, e.g. crimes in a local area as a percentage of total crimes across all areas, or as a percentage of the total for the year.  
*   Pros: Good for checking resource allocation within an area; reduces issues of inter-regional variations (e.g. how an area does against the national picture).  
*   Cons: No direct comparison with national picture.  
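
To make the distinction concrete, the sketch below compares a hypothetical prediction that is twice too high in every area with the observed counts, first in absolute terms and then as shares of the total. The numbers and the summed-absolute-difference measure are assumptions for illustration only.

```python
def shares(counts):
    """Convert absolute counts per area into fractions of the total."""
    total = sum(counts)
    if total == 0:
        raise ValueError("cannot compute shares of an all-zero dataset")
    return [c / total for c in counts]


# Hypothetical counts per area (illustrative numbers only).
observed = [10, 20, 30, 40]
predicted = [20, 40, 60, 80]  # twice too high everywhere

# Absolute comparison: penalises the overall over-prediction.
abs_error = sum(abs(p - o) for p, o in zip(predicted, observed))

# Relative comparison: only the distribution between areas matters.
rel_error = sum(
    abs(p - o) for p, o in zip(shares(predicted), shares(observed))
)

print(abs_error)
print(rel_error)
```

Here the relative error is zero even though every absolute figure is wrong, so relative measures check the allocation between areas but say nothing about the overall level.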

In practice, for validation, is there much difference?  

Ranges used for percentages etc. can be absolute (from zero to some maximum e.g. UK or city region)   
*   Pros: Good for inter-regional comparison.  
*   Cons: Can give a poor impression in inter-regional comparisons, where local results may be good compared with range.  

or from local minimum to local maximum.   
*   Pros: Gives a more reasonable intra-regional comparison. 
*   Cons: Have to be aware of zeroing minima where there are no zeros in areas. Removes inter-regional comparison option.  

Either kind of range limits the assessment of predictions that fall outside it.  

In general, model predictions generate absolute figures, which can be compared locally with absolute observed figures; however, the resulting statistics are then biased by the magnitude of the data. To resolve this, statistics are often normalised by dividing by the range of the data (for example, in the Normalised Root Mean Square Error), but which range to take is often unclear. It could be the range of the prediction, the real data, the combined data, or data from some larger system, depending on the data available and the use case. 
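
The effect of the choice of range can be seen in a small sketch of the Normalised Root Mean Square Error, here taken as RMSE divided by (max - min) of some reference data. The data values and the "wider system" are hypothetical, chosen only to show how much the statistic moves.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between two equal-length lists."""
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(observed))


def nrmse(predicted, observed, reference):
    """RMSE divided by the range (max - min) of `reference`."""
    value_range = max(reference) - min(reference)
    if value_range == 0:
        raise ValueError("reference data has zero range")
    return rmse(predicted, observed) / value_range


# Hypothetical values (illustrative numbers only).
observed = [2, 4, 6, 8]
predicted = [3, 5, 5, 11]
wider_system = observed + [0, 20]  # assumed data from some larger system

results = {
    "observed range": nrmse(predicted, observed, observed),
    "predicted range": nrmse(predicted, observed, predicted),
    "combined range": nrmse(predicted, observed, observed + predicted),
    "wider-system range": nrmse(predicted, observed, wider_system),
}
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

The same prediction gives a different score under each normalisation, so whichever range is used needs to be reported alongside the statistic.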


## Are we interested in a single scale, multiple scales, or cross-scale?

Single-scale statistics are fine for policy- or practice-driven analysis where the scale is fixed and known.

Multiple-scale statistics give the comparative statistic for a range of scales. This helps because point-by-point measures at a single scale don't cope well with a prediction that is right but slightly out geographically. In general (strange effects like negative spatial autocorrelation aside), the correlation between two spatial datasets will increase with scale as the datasets overlap more (does weighting for area have an effect? Would this be reasonable? It would essentially be an averaging window). Given this, the user has to weigh resolution against correlation, looking for the smallest areas that give a good prediction: to some extent we can think of this as the optimum prediction scale for the model. We hope this shows up as a kink in the validation graph, though this may not be optimal, as it relates to the spatial distribution of the datasets.

If predictions are not on a raster or point basis (for example, they're on a street network), what does scale mean? How do we assess how close a prediction has to be on a street network to count as a good prediction?
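
The raster case can be sketched as follows: a prediction that puts every hotspot one cell away from where it really occurred correlates poorly at the finest scale, but well once cells are aggregated. The simple non-overlapping block aggregation, the Pearson correlation and the grids are assumptions made for this example; it is not the procedure from any of the references below.

```python
import math


def make_grid(size, cells):
    """Build a size x size grid of zeros with the given {(row, col): value}."""
    grid = [[0] * size for _ in range(size)]
    for (r, c), value in cells.items():
        grid[r][c] = value
    return grid


def aggregate(grid, factor):
    """Sum a square grid into non-overlapping factor x factor blocks.
    The grid size must be divisible by factor."""
    n = len(grid)
    return [
        [
            sum(grid[r + dr][c + dc] for dr in range(factor) for dc in range(factor))
            for c in range(0, n, factor)
        ]
        for r in range(0, n, factor)
    ]


def pearson(a, b):
    """Pearson correlation between the cells of two equal-sized grids."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical 8 x 8 grids: each predicted hotspot is one cell off.
observed = make_grid(8, {(0, 0): 5, (2, 5): 3, (6, 2): 4})
predicted = make_grid(8, {(0, 1): 5, (3, 5): 3, (6, 3): 4})

correlations = {
    factor: pearson(aggregate(observed, factor), aggregate(predicted, factor))
    for factor in (1, 2, 4)
}
for factor, r in correlations.items():
    print(f"block size {factor}: r = {r:.3f}")
```

Plotting the correlation against block size gives the validation graph discussed above. Fixed blocks are sensitive to where the block boundaries fall, which a moving window would reduce.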

Cross-scale statistics unify multiple scales to a single statistic that represents the significance across multiple scales. The different scales can be combined with weightings, or as binary masks, or linearly. 
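
The three ways of combining scales mentioned above can be sketched as weighted means that differ only in their weighting function. The per-scale statistics, the exponential down-weighting of coarser scales and its decay rate of 0.5 are all hypothetical choices for illustration.

```python
import math


def cross_scale(stat_by_scale, weight):
    """Weighted mean of per-scale statistics.
    `weight` is a function mapping a scale to its (non-negative) weight."""
    total_weight = sum(weight(scale) for scale in stat_by_scale)
    if total_weight == 0:
        raise ValueError("all scales have zero weight")
    weighted = sum(weight(scale) * value for scale, value in stat_by_scale.items())
    return weighted / total_weight


# Hypothetical statistic (e.g. a correlation) at block sizes 1, 2 and 4.
stat_by_scale = {1: 0.2, 2: 0.6, 4: 0.9}

# Linear: every scale counts equally.
linear = cross_scale(stat_by_scale, lambda s: 1.0)

# Weighted: finer scales count for more (assumed exponential decay).
fine_weighted = cross_scale(stat_by_scale, lambda s: math.exp(-0.5 * (s - 1)))

# Binary mask: only scales up to block size 2 are included.
masked = cross_scale(stat_by_scale, lambda s: 1.0 if s <= 2 else 0.0)

print(f"linear: {linear:.3f}")
print(f"fine-weighted: {fine_weighted:.3f}")
print(f"masked: {masked:.3f}")
```

Which weighting is appropriate depends on the use case: down-weighting coarse scales rewards models that are right at fine resolution, while a mask restricts the statistic to scales that matter operationally.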

For more on multi-scale statistics, see the references below. The classic studies on correlation coefficients varying with scale are:

*   Robinson, W.S. (1950) 'Ecological correlations and the behaviour of individuals' American Sociological Review, 15, 351-357. http://links.jstor.org/sici?sici=0003-1224%28195006%2915%3A3%3C351%3AECATBO%3E2.0.CO%3B2-R
*   Gehlke, C.E. and Biehl, H. (1934) 'Certain effects of grouping upon the size of correlation coefficients in census tract material' Journal of the American Statistical Association, 29 Supplement, 169-170. http://links.jstor.org/sici?sici=0162-1459%28193403%2929%3A185%3C169%3ACEOGUT%3E2.0.CO%3B2-M
*   Costanza, R. (1989) 'Model goodness of fit: A multiple resolution procedure' Ecological Modelling, 47, 199-215. http://www.sciencedirect.com/science/article/pii/030438008990001X
*   Malleson, N.S., Heppenstall, A.J., See, L.M. and Evans, A.J. (2009) 'Evaluating an Agent-Based Model of Burglary' Working Paper 10/1, School of Geography, University of Leeds. http://www.geog.leeds.ac.uk/fileadmin/downloads/school/research/wpapers/10_1.pdf
*   Practical: http://www.geog.leeds.ac.uk/courses/other/programming/practicals/general/modelling/validation/multiscale-code/index.html
*   Code: https://github.com/MassAtLeeds/ExpandingCell

For an example cross-scale statistic, see GAM:
http://www.ccg.leeds.ac.uk/software/gam/ 
