<a href="https://colab.research.google.com/github/Avipsa1/UPPP275-Notebooks/blob/main/Module3a_vector_data_structure.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Spatial Data Structure - Vector

This lecture introduces two ways to create and use vector data models.

**Part I**: Code it yourself (this class).

**Part II**: Code it with libraries.


## About Vector Data

Vector data are composed of discrete geometric locations (x, y values) known as vertices that define the “shape” of the spatial object. The organization of the vertices determines the type of vector that you are working with. There are three types of vector data:

**Points**: Each individual point is defined by a single x, y coordinate. There can be many points in a vector point file. Examples of point data include: sampling locations, the location of individual trees or the location of plots.

**Lines**: Lines are composed of two or more vertices, or points, that are connected. For instance, a road or a stream may be represented by a line. This line is composed of a series of segments, and each “bend” in the road or stream represents a vertex that has a defined x, y location.

**Polygons**: A polygon consists of 3 or more vertices that are connected and “closed”. Thus the outlines of plot boundaries, lakes, oceans, and states or countries are often represented by polygons. Occasionally, a polygon can have a hole in the middle of it (like a doughnut). This is something to be aware of, but not an issue you will deal with in this tutorial.

![alt text](http://www.public.asu.edu/~wenwenl1/gis322o/images/vector.png)

## Define a Point

In [None]:
# define a point by its x and y coordinates 
x1, y1 = 0, 0

# define a second point also by its x and y coordinates
x2, y2 = 3, 4

## Distance

A typical spatial operation for point data is to calculate the distance between two points.

The formula for calculating the distance is shown below:

![alt text](http://www.public.asu.edu/~wenwenl1/gis322o/images/distance.png)

To calculate the distance between two points, we take the square root of the sum of the squared differences between their x coordinates and between their y coordinates.
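Written out in symbols, and assuming the two points are labeled $(x_1, y_1)$ and $(x_2, y_2)$ as in the code cells of this lecture, the formula reads:

$$d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$$

For the two points defined earlier, $(0, 0)$ and $(3, 4)$, this gives $d = \sqrt{(0 - 3)^2 + (0 - 4)^2} = \sqrt{9 + 16} = 5$.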

The built-in **math** module can help us take the square root. See the example below.

In [None]:
import math
print(math.sqrt(4))
print(math.sqrt(9))

# Note: don't forget to import the module before using it!

2.0
3.0


In [None]:
# Now let's define a distance() function that takes the two points as input and returns the distance between them

import math

def distance(x1, y1, x2, y2): # the four input values are the coordinates of two points, (x1,y1) and (x2,y2)
  d = math.sqrt((x1-x2)**2 + (y1-y2)**2)
  return d # set the return value
 
# define two points; you may give the variables any names you like
a1, a2 = 0, 0 # point1 (x, y)
b1, b2 = 3, 4 # point2 (x, y)

# The distance() function will not run until you call it
# Since this function has a return value, you can save it to a variable, named dis here
dis = distance(a1,a2,b1,b2)
print("The distance between p1 and p2 is: ", dis)



The distance between p1 and p2 is:  5.0


In [None]:
# There are several ways to define a distance function.
# In the previous definition, we 'spelled out' all the coordinates of the two points by using four parameters
# If we store the coordinates of each point in a list, the function only needs two parameters, one per point
# For instance, assume point1 has the data structure [x1,y1] and point2 has the data structure [x2,y2]
# Then we can define the distance() function as follows

import math
def distance(p1,p2):
  # p1 and p2 are two lists:
  # x1, y1 of p1 are accessed using p1[0], p1[1]
  # x2, y2 of p2 are accessed using p2[0], p2[1]
  d = math.sqrt((p1[0]-p2[0])**2 + (p1[1]-p2[1])**2)
  return d # set the return value
 
# now define the two points
p1 = [0,0]
p2 = [3,4]
# call the function
dis = distance(p1,p2)
print("The distance between p1 and p2 is: ", dis)

The distance between p1 and p2 is:  5.0
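
As a cross-check on the hand-written formula, the standard library's `math.hypot` computes the same square root of summed squares. The sketch below is only an illustration; the function name `distance_hypot` is made up for this example and is not used elsewhere in the lecture.

```python
import math

def distance_hypot(p1, p2):
    # math.hypot(dx, dy) returns sqrt(dx**2 + dy**2)
    dx = p1[0] - p2[0]
    dy = p1[1] - p2[1]
    return math.hypot(dx, dy)

# same two points as in the cell above
check = distance_hypot([0, 0], [3, 4])
print("Distance computed with math.hypot:", check)
```

Swapping the two arguments gives the same value, because the differences are squared.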


## Polyline

Lines are composed of two or more vertices, or points, that are connected. For instance, a road or a stream may be represented by a line. This line is composed of a series of segments, and each “bend” in the road or stream represents a vertex that has a defined x, y location.



In [None]:
# In the above example, we defined a point as a list containing its x and y coordinates
# To define a polyline, we need a data structure that holds a list of points

# Option 1: define it as a list of lists.
p1 = [1,2]
p2 = [3,5]
p3 = [4, 10]

poly1 = [p1,p2,p3]
print(poly1)

#This is the same as defining poly1 as:
poly1 = [[1,2], [3,5], [4, 10]]
print(poly1)

# Option 2: define it as a list of two lists: one with all the x coordinates and one with all the y coordinates of the points
poly1 = [[p1[0],p2[0],p3[0]],[p1[1],p2[1],p3[1]]]
print(poly1)

[[1, 2], [3, 5], [4, 10]]
[[1, 2], [3, 5], [4, 10]]
[[1, 3, 4], [2, 5, 10]]


In [None]:
# When you store a polyline in a different data structure, you need to index it differently to get the correct values
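
To make the difference concrete, here is a small sketch that reads the second point from both layouts and converts between them. The variable names (`poly_points`, `poly_xy`, and so on) are chosen for this illustration only.

```python
# Layout A: a list of [x, y] points
poly_points = [[1, 2], [3, 5], [4, 10]]
# Layout B: a list of all x values followed by a list of all y values
poly_xy = [[1, 3, 4], [2, 5, 10]]

# Second point (index 1) in layout A: index the point first, then the coordinate
second_a = [poly_points[1][0], poly_points[1][1]]
# Second point in layout B: index the coordinate list first, then the point
second_b = [poly_xy[0][1], poly_xy[1][1]]
print(second_a, second_b)

# Converting A -> B: collect all x values, then all y values
a_to_b = [[p[0] for p in poly_points], [p[1] for p in poly_points]]
# Converting B -> A: pair up the i-th x with the i-th y
b_to_a = [[poly_xy[0][i], poly_xy[1][i]] for i in range(len(poly_xy[0]))]
print(a_to_b)
print(b_to_a)
```

The rest of this lecture uses layout A, because each element of the list is then a complete point that can be passed directly to `distance()`.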


## Calculate the length of a polyline

The length of a polyline equals the sum of the lengths of its line segments.

Say a polyline has three points: p1, p2, p3.

Then the length of the polyline equals the sum of two line segments:

`<p1,p2>` and `<p2,p3>`

The length of a segment can be calculated as the straight-line distance between its two end points.

Following this simple example, if a polyline has n points: p1, p2, ..., pn

Then the length of the polyline equals the sum of n-1 line segments:

`<p1,p2>`, `<p2,p3>`, `<p3,p4>`, ..., `<p(n-1),pn>`

So the next question is:

### How to obtain the length of a polyline?

-- We use a loop over the first n-1 points, because n points form only n-1 segments.

-- If the current point has index i, then the next point has index i+1. Using these indices, we can access the coordinates of p(i) and p(i+1) and then call the distance() function to get the length of that line segment.

-- Finally, we obtain the length of the polyline by summing up the lengths of all line segments.
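
Before computing any distances, it can help to look at the index pairing on its own. The sketch below only lists the segments of a four-point polyline; the names `points` and `segments` are made up for this illustration.

```python
points = [[1, 2], [3, 5], [4, 10], [12, 20]]  # n = 4 points

# Pair each point i with its successor i+1; the last point has no successor,
# so the loop stops at index n-2, which is what range(len(points) - 1) does
segments = []
for i in range(len(points) - 1):
    segments.append((points[i], points[i + 1]))

for start, end in segments:
    print(start, "->", end)

print("Number of segments:", len(segments))
```

Four points produce three segments, matching the n-1 rule described above.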

In [None]:
# For instance, if we define poly1 as:
poly1 = [[1,2], [3,5], [4, 10], [12, 20]] # a polyline with four points

# First, define a distance function that takes two points, each stored as a list [x, y]
# We can simply copy and paste the distance() function we defined earlier

import math
def distance(p1,p2):
  # p1 and p2 are two lists:
  # x1, y1 of p1 are accessed using p1[0], p1[1]
  # x2, y2 of p2 are accessed using p2[0], p2[1]
  d = math.sqrt((p1[0]-p2[0])**2 + (p1[1]-p2[1])**2)
  return d # set the return value

# Set the length variable outside the loop, with initial value 0
length = 0
print("Initial length is: ", length)
# Next, let's write the loop
for i in range(len(poly1)-1): # visit every point except the last one: n points give only n-1 segments, hence len(poly1)-1
  print("Iteration", i)
  pi = poly1[i] # get the ith point
  pi_plus1 = poly1[i+1] # get the (i+1)th point
  dis = distance(pi, pi_plus1)
  print("   The distance between p%d and p%d is: %.2f" % (i, i+1, dis)) # %f means to output a floating-point number, .2 means two decimal digits
  length += dis # add each segment length to the total length
  print("   Polyline length becomes: %.2f (by adding %.2f)" % (length,dis))
  

Initial length is:  0
Iteration 0
   The distance between p0 and p1 is: 3.61
   Polyline length becomes: 3.61 (by adding 3.61)
Iteration 1
   The distance between p1 and p2 is: 5.10
   Polyline length becomes: 8.70 (by adding 5.10)
Iteration 2
   The distance between p2 and p3 is: 12.81
   Polyline length becomes: 21.51 (by adding 12.81)


Okay, now 

### How about defining the length calculation procedure as a function?

See the example below.

In [None]:
# Start from a copy of the code in the cell above
# The lines added to turn it into a function are highlighted below with rows of ########################
# Running this cell gives the same results as above

# For instance, if we define poly1 as:
poly1 = [[1,2], [3,5], [4, 10], [12, 20]] # a polyline with four points

# First, define a distance function that takes two points, each stored as a list [x, y]
# We can simply copy and paste the distance() function we defined earlier

import math
def distance(p1,p2):
  # p1 and p2 are two lists:
  # x1, y1 of p1 are accessed using p1[0], p1[1]
  # x2, y2 of p2 are accessed using p2[0], p2[1]
  d = math.sqrt((p1[0]-p2[0])**2 + (p1[1]-p2[1])**2)
  return d # set the return value

                 ##########################################################################################################
def lengthCal(poly1): ################# add this line to convert the procedure into a function so we can reuse it easily
                 ##########################################################################################################
  # Set the length variable outside the loop, with initial value 0
  length = 0
  print("Initial length is: ", length)
  # Next, let's write the loop
  for i in range(len(poly1)-1): # visit every point except the last one: n points give only n-1 segments, hence len(poly1)-1
    print("Iteration", i)
    pi = poly1[i] # get the ith point
    pi_plus1 = poly1[i+1] # get the (i+1)th point
    dis = distance(pi, pi_plus1)
    print("   The distance between p%d and p%d is: %.2f" % (i, i+1, dis)) # %f means to output a floating-point number, .2 means two decimal digits
    length += dis # add each segment length to the total length
    print("   Polyline length becomes: %.2f (by adding %.2f)" % (length,dis))
    
###############################################################################################################################
lengthCal(poly1) ################# add this line to call the function in order to run the code block inside the function
###############################################################################################################################

Initial length is:  0
Iteration 0
   The distance between p0 and p1 is: 3.61
   Polyline length becomes: 3.61 (by adding 3.61)
Iteration 1
   The distance between p1 and p2 is: 5.10
   Polyline length becomes: 8.70 (by adding 5.10)
Iteration 2
   The distance between p2 and p3 is: 12.81
   Polyline length becomes: 21.51 (by adding 12.81)
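
The `lengthCal()` function above prints its progress but does not hand the total back to the caller. A possible variant, sketched here under the assumption that points are stored as `[x, y]` lists, returns the total so it can be saved in a variable. The names `polyline_length` and `route` are hypothetical and used only in this example.

```python
import math

def distance(p1, p2):
    # straight-line distance between two [x, y] points
    return math.sqrt((p1[0] - p2[0])**2 + (p1[1] - p2[1])**2)

def polyline_length(points):
    # sum the n-1 segment lengths of a polyline with n points
    total = 0.0
    for i in range(len(points) - 1):
        total += distance(points[i], points[i + 1])
    return total  # hand the result back instead of only printing it

# a small polyline whose segments have lengths 5 and 4
route = [[0, 0], [3, 4], [3, 0]]
route_length = polyline_length(route)
print("Polyline length: %.2f" % route_length)
```

Because the function returns a number, the result can be reused later, for example to compare the lengths of two polylines.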


## Polygon

**Polygons**: A polygon consists of 3 or more vertices that are connected and “closed”. Thus the outlines of plot boundaries, lakes, oceans, and states or countries are often represented by polygons. Occasionally, a polygon can have a hole in the middle of it (like a doughnut). This is something to be aware of, but not an issue you will deal with in this tutorial.

A polygon is also defined as a sequence of points. The only difference is that, to make it a closed loop, the first point and the last point of a polygon must be the same. See the example below.
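
If a list of vertices does not yet repeat its first point at the end, it can be closed in code. The helper below is a sketch for illustration; `close_ring` is a made-up name, not a function from any library.

```python
def close_ring(points):
    # work on a copy so the caller's list is left unchanged
    ring = [list(p) for p in points]
    # append a copy of the first point only if the ring is not closed yet
    if ring and ring[0] != ring[-1]:
        ring.append(list(ring[0]))
    return ring

open_vertices = [[1, 2], [3, 5], [4, 10], [12, 20]]
closed = close_ring(open_vertices)
print(closed)
print(len(open_vertices), len(closed))
```

Calling `close_ring()` on a list that is already closed returns it without adding another point.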


In [None]:
poly1 = [[1,2], [3,5], [4, 10], [12, 20], [1,2]] # four distinct vertices; the first point is repeated at the end to close the polygon

print(len(poly1)) # the list has five entries, because the first point and last point in a polygon are the same


5


So now, if we are asked "**How to get the perimeter of a polygon?**"

The answer is quite simple. We can reuse the code we developed above for calculating the length of a polyline to get the perimeter of the polygon. The only change is to feed the code different data: a closed sequence of points.

See the example below.

In [None]:
poly1 = [[1,2], [3,5], [4, 10], [12, 20], [1,2]] # four distinct vertices; the first point is repeated at the end to close the polygon


import math
def distance(p1,p2):
  # p1 and p2 are two lists:
  # x1, y1 of p1 are accessed using p1[0], p1[1]
  # x2, y2 of p2 are accessed using p2[0], p2[1]
  d = math.sqrt((p1[0]-p2[0])**2 + (p1[1]-p2[1])**2)
  return d # set the return value

# Set the length variable outside the loop, with initial value 0
length = 0
print("Initial length is: ", length)

# Next, let's write the loop
for i in range(len(poly1)-1): # visit every entry except the last one: n entries give only n-1 segments, hence len(poly1)-1
  print("Iteration", i)
  pi = poly1[i] # get the ith point
  pi_plus1 = poly1[i+1] # get the (i+1)th point
  dis = distance(pi, pi_plus1)
  print("   The distance between p%d and p%d is: %.2f" % (i, i+1, dis)) # %f means to output a floating-point number, .2 means two decimal digits
  length += dis # add each segment length to the total length
  print("   Length becomes: %.2f (by adding %.2f)\n" % (length,dis))

print("The perimeter length of the polygon is: %.2f" % length)

Initial length is:  0
Iteration 0
   The distance between p0 and p1 is: 3.61
   Length becomes: 3.61 (by adding 3.61)

Iteration 1
   The distance between p1 and p2 is: 5.10
   Length becomes: 8.70 (by adding 5.10)

Iteration 2
   The distance between p2 and p3 is: 12.81
   Length becomes: 21.51 (by adding 12.81)

Iteration 3
   The distance between p3 and p4 is: 21.10
   Length becomes: 42.61 (by adding 21.10)

The perimeter length of the polygon is: 42.61
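
An alternative design, sketched here as an assumption rather than as the approach used in this lecture, stores each vertex only once and lets the index wrap around from the last vertex back to the first with the modulo operator. The names `perimeter` and `triangle` are hypothetical.

```python
import math

def perimeter(vertices):
    # vertices: list of [x, y] points, with the first point NOT repeated at the end
    n = len(vertices)
    total = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        # (i + 1) % n is i+1 for every vertex except the last, where it wraps to 0
        x2, y2 = vertices[(i + 1) % n]
        total += math.sqrt((x1 - x2)**2 + (y1 - y2)**2)
    return total

# a right triangle with side lengths 3, 4 and 5
triangle = [[0, 0], [3, 0], [3, 4]]
print("Perimeter of the triangle:", perimeter(triangle))
```

With this layout the loop runs n times instead of n-1, because a closed shape with n vertices has n edges. Passing a list that already repeats its first point would add a zero-length edge, so the total stays the same.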
