<a href="https://colab.research.google.com/github/tomersk/learn-python/blob/main/06_01.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 6 Spatial Data

## 6.1 Types of spatial data
Raster and vector are the two basic data structures for storing and manipulating images and graphics data in GIS (Geographic Information Systems). Raster data comes in the form of individual pixels: each spatial location (resolution element) has an associated pixel whose value indicates an attribute such as color, elevation, or an ID number. Vector data comes in the form of points and lines that are geometrically and mathematically associated. Points are stored by their coordinates; for example, a two-dimensional point is stored as ($x$, $y$). Lines are stored as a series of point pairs, where each pair represents a straight line segment; for example, ($x_1$, $y_1$) and ($x_2$, $y_2$) indicate a line from ($x_1$, $y_1$) to ($x_2$, $y_2$).

We will create some raster data using a mathematical function and then add noise to it. We will keep both versions of the data (with and without noise) for future use. *np.mgrid* is used to create the gridded points. The data is plotted using the *plt.matshow* function, which is a simple way to visualize a two-dimensional array. The figures below show the data without and with noise. The data without noise shows a systematic pattern, which is blurred in the data with added noise.

In [None]:
import numpy as np
import matplotlib.pyplot as plt

# generate some synthetic data on a 101 x 101 grid
X, Y = np.mgrid[0:101, 0:101]
data = np.sin((X**2 + Y**2)/25)
# add uniform random noise drawn from [0, 1)
data_noisy = data + np.random.random(X.shape)

# plot the data
plt.matshow(data)
plt.colorbar()
plt.title("Synthetic data without noise")
plt.show()

plt.matshow(data_noisy)
plt.colorbar(shrink=0.5)
plt.title("Synthetic data perturbed with noise")
plt.show()
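
To see what *np.mgrid* returns, it may help to look at a much smaller grid. The following is a minimal sketch; the 3 x 4 grid size and the name `raster` are chosen here only for illustration.

```python
import numpy as np

# A small 3 x 4 grid so that the index arrays are easy to inspect
X, Y = np.mgrid[0:3, 0:4]

print(X.shape, Y.shape)  # both arrays have the shape of the grid
print(X)                 # X holds the row index of every cell
print(Y)                 # Y holds the column index of every cell

# Any element-wise function of X and Y then gives a raster (2-D array)
raster = X**2 + Y**2
print(raster)
```

Each cell of `raster` is computed from its own pair of grid indices, which is the same way `data` is built from `X` and `Y` in the cell above.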

We can also generate vector data from a few points. The figure below shows the vector data.

In [None]:
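# vector data: vertices of a polyline (the first and last points coincide, so it is closed)
vector_x = [10, 7, 24, 16, 15, 10]
vector_y = [10, 23, 20, 14, 7, 10]

# plot vector data, joining consecutive points with straight lines
plt.plot(vector_x, vector_y)
# axis limits: (xmin, xmax, ymin, ymax)
plt.axis((5, 25, 5, 25))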
plt.show()
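
The description of lines as a series of point pairs can be made explicit with the same vertices. This is only a sketch in plain Python; the names `points`, `segments` and `is_closed` are introduced here for illustration.

```python
# Same vertices as in the plotting cell above
vector_x = [10, 7, 24, 16, 15, 10]
vector_y = [10, 23, 20, 14, 7, 10]

# Each point is stored as an (x, y) tuple
points = list(zip(vector_x, vector_y))

# Each line segment is a pair of consecutive points
segments = list(zip(points[:-1], points[1:]))
for start, end in segments:
    print(start, "->", end)

# The shape is closed when the last point repeats the first one
is_closed = points[0] == points[-1]
print("closed:", is_closed)
```

Six points give five segments, and because the last point repeats the first, the segments form a closed shape.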

Geospatial data can be divided into two major parts. The first part holds information about some feature, for example a two-dimensional array showing the spatial variation in elevation. The second part holds the coordinates of the data. A typical processing chain for geospatial data is given in the flow chart below. We take the geospatial data, extract the feature information and the coordinate information, process them separately, and finally combine them again. The processing of the feature information could be some mathematical operation, and the processing of the coordinate information could be a coordinate transformation.

![Flow chart](https://drive.google.com/uc?export=view&id=1OjEnt4XYIWcemtn7dvtu2TNVIQc_68ND)
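
The processing chain can be sketched with plain NumPy arrays. Everything specific in this sketch is an assumption made for illustration: the grid size, the origin (`x0`, `y0`), the cell size (`dx`, `dy`), the rescaling used as the "mathematical operation", and the linear index-to-map conversion used as the "coordinate transformation". Real geospatial files would normally be read with a dedicated GIS library.

```python
import numpy as np

# Feature information: a small synthetic 2-D array (stand-in for e.g. elevation)
rows, cols = np.mgrid[0:4, 0:5]
feature = np.sin((rows**2 + cols**2) / 25)

# Coordinate information: made-up origin and cell size (assumed values)
x0, y0 = 500000.0, 4200000.0   # map coordinates of the upper-left cell
dx, dy = 30.0, 30.0            # cell size in map units

# Process the feature information: rescale the values to the range 0 to 1
feature_scaled = (feature - feature.min()) / (feature.max() - feature.min())

# Process the coordinate information: convert grid indices to map coordinates
x = x0 + cols * dx
y = y0 - rows * dy   # y decreases as the row index increases

# Combine both parts again: one (x, y, value) triple per cell
combined = np.stack([x, y, feature_scaled], axis=-1)
print(combined.shape)
```

The two parts never interact until the final step, which is what allows them to be processed independently.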