# Point Patterns: Part 1 - Generating CSR

In this practical we are going to:
(1) explore how random numbers and random points can be generated.
(2) appreciate the nature of uniform random points and how they are distributed.
(3) measure the centrality and the spatial dispersion of random points.
(4) use Excel to quickly check the outcomes of basic exploratory data analysis.
(5) follow a point pattern practical using PySAL to familiarise ourselves with the process.
[Bonus: Install and use GeoDa.]

#### (1) Try generating a random integer between 0 and N (where N is an integer and N>0)

In [None]:
import os
import numpy as np
import pysal as ps
# import pandas as pd
# import geopandas as gpd
# import seaborn as sns
# from shapely.geometry import Point
# from pysal.contrib.viz import mapping as maps # For maps.plot_choropleth

import random

# Make sure output is written into notebook
%matplotlib inline

N = 10  # example upper bound; replace with any integer N > 0
random.randint(0, N)

This should give you an integer randomly selected between 0 and N. Note that the range from which a random integer is selected INCLUDES both 0 and N (i.e. the boundaries are included, as in [0…N] ).
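The inclusive bounds can be checked empirically. The following is a minimal sketch, not part of the original practical: it uses a small illustrative value of N and a fixed seed (both assumptions) so that the check is repeatable.

```python
import random

random.seed(0)  # fixed seed only so that the check is repeatable

N = 3  # small illustrative upper bound (an assumption for this sketch)
draws = [random.randint(0, N) for _ in range(1000)]

# Every draw lies in the closed interval [0, N], and both ends do occur.
print('%d %d' % (min(draws), max(draws)))
print(sorted(set(draws)))
```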

Bonus Question 1: What would you change to prevent either 0 or N from being selected?

#### (2) Let us now think of generating a random number between 0.0 and 1.0.

In [None]:
print(random.random())


To specify a range other than 0.0 to 1.0, try random.uniform(a,b) where a < b.
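As a short illustration of random.uniform, the sketch below draws one value from a range other than 0.0 to 1.0. The bounds 10.0 and 20.0 are arbitrary values chosen for this example.

```python
import random

a, b = 10.0, 20.0  # illustrative bounds, not taken from the practical

x = random.uniform(a, b)  # one random float between a and b
print('%.6f' % x)
```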

Try generating five random numbers in a row: 

In [None]:
for i in range(5):
    print('%.6f' % random.random())

Bonus Question 2: How do we know if these are purely random? Or are they merely pseudo-random? If so, how do these programmes decide where to start and what do they do to randomise the values? Would that be good enough for the purpose of (most) applications in geocomputation and spatial analysis?
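One way to start exploring this question is to fix the starting point (the seed) of the generator yourself. Python's random module is a deterministic pseudo-random generator (a Mersenne Twister), so the same seed reproduces the same sequence. The seed value 42 below is arbitrary.

```python
import random

random.seed(42)  # start the generator from a known state
first = [random.random() for _ in range(3)]

random.seed(42)  # reset the generator to the same state
second = [random.random() for _ in range(3)]

# The two runs produce exactly the same "random" numbers.
print(first == second)
```

When no seed is given, Python seeds the generator from the operating system's randomness source (or the system time), which is why each fresh run normally differs.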

#### (3) Let us move on to producing random points.
We use the same principle as above except, this time, we must specify both the x and the y values. By default, they will be generated between (0,0) and (1,1), but can be transformed to adapt to any value range.
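The transformation to another value range can be sketched as follows. The helper random_point and the extent passed to it are hypothetical, introduced only for illustration.

```python
import random

def random_point(xmin, ymin, xmax, ymax):
    """Return one uniform random point inside the given rectangle."""
    # Stretch a value from [0, 1) to the width/height, then shift it.
    x = xmin + (xmax - xmin) * random.random()
    y = ymin + (ymax - ymin) * random.random()
    return x, y

# Hypothetical extent, chosen only for illustration
px, py = random_point(500000.0, 170000.0, 510000.0, 180000.0)
print('%.2f %.2f' % (px, py))
```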

In [None]:
print('%.6f %.6f' % (random.random(), random.random()))

In [None]:
# Let us generate a set of random points.
for i in range(10):
    print('%.6f %.6f' % (random.random(), random.random()))

Question: Think about how these points are distributed. Are they different from one set to another? Why?


#### (4) Generate files containing random points and plot them within a square area of [(0,0), (1,1)].
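One possible way to approach this step is sketched below. It is only an illustration under stated assumptions: the helper names, the CSV layout (a header row followed by x,y pairs) and the file name are all hypothetical, and the plotting helper assumes that matplotlib is installed.

```python
import csv
import os
import random
import tempfile

def write_random_points(path, n):
    """Write n uniform random points in the unit square to a CSV file."""
    with open(path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['x', 'y'])
        for _ in range(n):
            writer.writerow(['%.6f' % random.random(),
                             '%.6f' % random.random()])

def read_points(path):
    """Read the points back as a list of (x, y) float tuples."""
    with open(path) as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        return [(float(x), float(y)) for x, y in reader]

def plot_points(points):
    """Scatter-plot the points within the square [(0,0), (1,1)]."""
    import matplotlib.pyplot as plt  # imported here so the file helpers work without it
    fig, ax = plt.subplots()
    ax.scatter([p[0] for p in points], [p[1] for p in points])
    ax.set_xlim(0, 1)
    ax.set_ylim(0, 1)
    ax.set_aspect('equal')
    plt.show()

# Written to the temporary directory here; use any path you like.
out_path = os.path.join(tempfile.gettempdir(), 'random_points_100.csv')
write_random_points(out_path, 100)
pts = read_points(out_path)
print(len(pts))
```

Calling plot_points(pts) in the notebook then draws the points; repeating the steps with different values of n gives several files to compare.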

Bonus Question 3: Try using randn(num) (available in NumPy as np.random.randn) instead of a uniform generator and see what happens. It draws num values from a normal (Gaussian) distribution with a mean of 0 and a standard deviation of 1, so their histogram should form a bell curve.
What kinds of geographical phenomena or events do you think can be simulated by a normal random distribution?

### Task 2: Thinking about the Spatial Patterns of Random Points

(1) Let us think of the mean centre of the random points.
Exercise 1(a): Can you calculate the spatial mean?

Question: If the points are uniformly randomly distributed, between (0,0) and (1,1), where would the spatial mean most likely be found? Why?
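A minimal sketch of the calculation, assuming the points are held as a list of (x, y) tuples; the helper name mean_centre is chosen here for illustration. The spatial mean is simply the mean of the x-values paired with the mean of the y-values.

```python
import random

def mean_centre(points):
    """Return the spatial mean (mean x, mean y) of a list of (x, y) points."""
    n = float(len(points))
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    return mean_x, mean_y

pts = [(random.random(), random.random()) for _ in range(1000)]
mx, my = mean_centre(pts)
print('%.4f %.4f' % (mx, my))
```

Compare the printed value with your answer to the question above.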

(2) Repeat producing a set of random points.
For each set, calculate the spatial mean, and see how they change their position from one set to another.

Question: Is there a way to make the spatial mean more stable?
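One way to investigate this is to repeat the experiment many times for different numbers of points and measure how much the mean centre moves. The sketch below does this for the x-coordinate only; the helper names, the seed and the choice of 200 repetitions are assumptions made for illustration.

```python
import random

def mean_centre(points):
    """Return the spatial mean (mean x, mean y) of a list of (x, y) points."""
    n = float(len(points))
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def spread_of_mean_x(n_points, n_sets=200):
    """Standard deviation of the mean-centre x-coordinate across repeated sets."""
    xs = []
    for _ in range(n_sets):
        pts = [(random.random(), random.random()) for _ in range(n_points)]
        xs.append(mean_centre(pts)[0])
    m = sum(xs) / len(xs)
    return (sum((x - m) ** 2 for x in xs) / len(xs)) ** 0.5

random.seed(1)  # fixed seed so the comparison is repeatable
for n in (10, 100, 1000):
    print('%5d %.4f' % (n, spread_of_mean_x(n)))
```

A smaller value in the second column means the mean centre shifts less from one set to the next.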

(3) In ordinary statistics, the standard deviation measures how widely the observations of a single variable are scattered about their mean. In the same way, we can measure the spatial dispersion of a point pattern by its standard distance $d_s$: the square root of the mean squared distance from every point to the spatial mean, i.e. $d_s = \sqrt{\frac{1}{n}\sum_{i=1}^{n} d_{ic}^2}$, where $d_{ic}$ is the distance of the $i$th point from the spatial mean and $n$ is the number of points. Try calculating this and draw the circle of influence (a circle of radius $d_s$) around the spatial mean.
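The standard distance described above can be computed along these lines. This is a sketch assuming the points are a list of (x, y) tuples; the function name standard_distance is chosen for illustration.

```python
import math
import random

def standard_distance(points):
    """Square root of the mean squared distance from each point to the spatial mean."""
    n = float(len(points))
    cx = sum(p[0] for p in points) / n  # spatial mean, x
    cy = sum(p[1] for p in points) / n  # spatial mean, y
    squared = sum((p[0] - cx) ** 2 + (p[1] - cy) ** 2 for p in points)
    return math.sqrt(squared / n)

pts = [(random.random(), random.random()) for _ in range(1000)]
print('%.4f' % standard_distance(pts))
```

The returned value is the radius to use when drawing the circle around the spatial mean.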

### Task 3: Using an Excel Spreadsheet

Coding in Python (in an open-source environment) is great: (1) it gives you complete control and flexibility over what you do, and (2) you do not have to pay the licence or package fees of commercial products, which matters especially for specialist software such as GIS (e.g. ArcGIS). At the same time, it can be tedious when you simply want to process something rudimentary quickly.
Here is an example of what you can do with a standard spreadsheet package:

(1) Using MS-Excel, can you repeat the above (generating random numbers and random points and finding the mean centre, as well as its spatial dispersion)?

Note that Excel uses a set of simple formulae. Random values can be produced with `=RAND()`, while a formula such as `=AVERAGE([cell 1]:[cell x])` gives you the average. A cell reference can be fixed (made absolute) with $ signs, e.g. `=SQRT(([cell x1] - [cell $x$m])^2 + ([cell y1] - [cell $y$m])^2)`
where `[cell $x$m]` and `[cell $y$m]` hold the x and y values of the mean centre, respectively.

The other handy feature is that we can easily copy a formula over to the next cell, and drag the same formula along rows or columns.

(2) Can you plot the random points, as well as the mean centre in a chart?
(Note that you should use an xy-scatter plot.)

(3) Question: Can you produce histograms of their distribution (one for the x-values and another for the y-values)? How are the bins (or the range) determined? If you change the size and number of bins, how would that affect the outcome? What does that tell you?
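To cross-check the spreadsheet histograms in Python, the bin counts can be reproduced with a few lines. This is a sketch assuming equal-width bins over [0, 1]; the helper name histogram, the seed and the sample size are hypothetical choices.

```python
import random

def histogram(values, n_bins, lo=0.0, hi=1.0):
    """Count values in n_bins equal-width bins covering [lo, hi]."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # A value equal to hi is placed in the last bin.
        i = min(int((v - lo) / width), n_bins - 1)
        counts[i] += 1
    return counts

random.seed(2)  # fixed seed so the counts are repeatable
xs = [random.random() for _ in range(200)]
for n_bins in (4, 10, 20):
    print('%2d %s' % (n_bins, histogram(xs, n_bins)))
```

Comparing the three rows shows how the same 200 values look smoother or noisier depending on the number of bins.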


### Bonus Task: Install GeoDa and Try It Out
GeoDa and PySAL are two open-source packages for spatial analysis, developed and offered by a group of researchers at Arizona State University, including Prof. Luc Anselin and Prof. Sergio Rey.
From GeoDa project page: “GeoDa is a free, open source, cross-platform software program that serves as an introduction to spatial data analysis. It runs on different versions of Windows (including XP, Vista, 7 and 8), Mac OS, and Linux.” 
You can download GeoDa from here: https://spatial.uchicago.edu/software

If you are installing in the GeoCUP environment, please make sure that you download the package for 32-bit Ubuntu Linux.

(1) Open/Read some of the files (containing a set of random points) you have created earlier. 
(2) Plot these points and carry out some basic exploratory data analysis (EDA).
(3) Produce histograms of the distribution of their x-values, and that of their y-values. Then change the number of bins and see how that affects the outcome.

When you are done, please make sure to save all files on your GeoCUP.
Don’t forget to log out (and take your GeoCUP with you!)

### Take Home Task
Download the dataset on the locations of Airbnb listings. Plot them on a map. Do you think they are randomly distributed, or do they form some pattern? How can we find out for sure?
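One simple starting point, offered here only as a sketch and not as the method the practical prescribes, is a quadrat count: divide the study area into equal cells, count the points in each, and compare the variance of the counts with their mean. For a completely random pattern the ratio is close to 1, while clustering pushes it well above 1. The code assumes the coordinates have already been rescaled to the unit square; the helper names and the synthetic test data are hypothetical.

```python
import random

def quadrat_counts(points, k):
    """Count points in a k-by-k grid of equal quadrats over the unit square."""
    counts = [0] * (k * k)
    for x, y in points:
        col = min(int(x * k), k - 1)  # a coordinate of exactly 1.0 goes in the last cell
        row = min(int(y * k), k - 1)
        counts[row * k + col] += 1
    return counts

def variance_mean_ratio(counts):
    """Sample variance of the quadrat counts divided by their mean."""
    n = float(len(counts))
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return var / mean

random.seed(3)  # fixed seed so the comparison is repeatable
uniform_pts = [(random.random(), random.random()) for _ in range(500)]
# Synthetic clustered pattern: all points packed into a small patch
clustered_pts = [(0.5 + 0.05 * random.random(), 0.5 + 0.05 * random.random())
                 for _ in range(500)]

vmr_uniform = variance_mean_ratio(quadrat_counts(uniform_pts, 5))
vmr_clustered = variance_mean_ratio(quadrat_counts(clustered_pts, 5))
print('%.2f' % vmr_uniform)
print('%.2f' % vmr_clustered)
```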


### Explore Point Patterns in PySAL

Go to "Planar Point Patterns in PySAL" by Serge Rey (sjsrey@gmail.com) to explore point patterns in PySAL. (http://nbviewer.jupyter.org/github/sjsrey/giasp16/blob/master/content/pages/notebooks/pointpattern.ipynb)


# Point Patterns: Part 2 - Methods of Point Pattern Analysis in PySAL

The points module in PySAL implements basic methods of point pattern analysis organized into the following groups:

* Point Processing
* Centrography and Visualization
* Quadrat Based Methods
* Distance Based Methods

In the remainder of this notebook we shall focus on point processing.

In [None]:
import pysal as ps
import numpy as np
from pysal.contrib.points.pointpattern import PointPattern

### Creating Point Patterns

#### From lists

We can build a point pattern by using Python lists of coordinate pairs $(s_0, s_1, \ldots, s_m)$ as follows:

In [None]:
points = [[66.22, 32.54], [22.52, 22.39], [31.01, 81.21],
          [9.47, 31.02],  [30.78, 60.10], [75.21, 58.93],
          [79.26,  7.68], [8.23, 39.93],  [98.73, 77.17],
          [89.78, 42.53], [65.19, 92.08], [54.46, 8.48]]
p1 = PointPattern(points)

In [None]:
p1.mbb

array([  8.23,   7.68,  98.73,  92.08])
Thus $s_0 = (66.22, 32.54)$ and $s_{11} = (54.46, 8.48)$.

In [None]:
p1.summary()

Point Pattern
12 points
Bounding rectangle [(8.23,7.68), (98.73,92.08)]
Area of window: 7638.2
Intensity estimate for window: 0.00157105077112
       x      y
0  66.22  32.54
1  22.52  22.39
2  31.01  81.21
3   9.47  31.02
4  30.78  60.10


In [None]:
type(p1.points)

pandas.core.frame.DataFrame


In [None]:
np.asarray(p1.points)

array([[ 66.22,  32.54],
       [ 22.52,  22.39],
       [ 31.01,  81.21],
       [  9.47,  31.02],
       [ 30.78,  60.1 ],
       [ 75.21,  58.93],
       [ 79.26,   7.68],
       [  8.23,  39.93],
       [ 98.73,  77.17],
       [ 89.78,  42.53],
       [ 65.19,  92.08],
       [ 54.46,   8.48]])

In [None]:
p1.mbb

array([  8.23,   7.68,  98.73,  92.08])

#### From numpy arrays

In [None]:
points = np.asarray(points)
points

array([[ 66.22,  32.54],
       [ 22.52,  22.39],
       [ 31.01,  81.21],
       [  9.47,  31.02],
       [ 30.78,  60.1 ],
       [ 75.21,  58.93],
       [ 79.26,   7.68],
       [  8.23,  39.93],
       [ 98.73,  77.17],
       [ 89.78,  42.53],
       [ 65.19,  92.08],
       [ 54.46,   8.48]])

In [None]:
p1_np = PointPattern(points)
p1_np.summary()

Point Pattern
12 points
Bounding rectangle [(8.23,7.68), (98.73,92.08)]
Area of window: 7638.2
Intensity estimate for window: 0.00157105077112
       x      y
0  66.22  32.54
1  22.52  22.39
2  31.01  81.21
3   9.47  31.02
4  30.78  60.10

#### From shapefiles
This example uses 200 randomly distributed points within the counties of Virginia. Coordinates are for UTM zone 17 N.

In [None]:
f = ps.examples.get_path('vautm17n_points.shp')
fo = ps.open(f)
pp_va = PointPattern(np.asarray([pnt for pnt in fo]))
fo.close()
pp_va.summary()

Point Pattern
200 points
Bounding rectangle [(273959.664381,4049220.90341), (972595.989578,4359604.85978)]
Area of window: 2.16845506675e+11
Intensity estimate for window: 9.22315629531e-10
               x             y
0  865322.486181  4.150317e+06
1  774479.213103  4.258993e+06
2  308048.692232  4.054700e+06
3  670711.529980  4.258864e+06
4  666254.475614  4.256514e+06


* Please refer to http://nbviewer.jupyter.org/github/sjsrey/giasp16/blob/master/content/pages/notebooks/pointpattern.ipynb for further study of the attributes of PySAL point patterns.