# <b>Lab 2: Spatial programming: Vector data</b>
##### WHDM
---

### <a id="1">Contents</a>
[1 Non-spatial data](#1.0) <br>
> [1.1 Reading tables into DataFrames with Pandas](#1.1) <br>
[1.2 Slicing data with Pandas](#1.2) <br>
[1.3 Retrieving unique values](#1.3) <br>
[1.4 Selecting data based on attributes](#1.4) <br>
[1.5 Formatting timestamps](#1.5) <br>
[1.6 Sorting by attribute](#1.6) <br> 
>
[2 Vector data](#2) <br>
>[2.1 Creating geometries using shapely](#2.1) <br>
[2.1.1 Point](#2.1.1)<br>
[2.1.2 MultiPoint](#2.1.2) <br>
-[2.1.2.1 List comprehension](#2.1.2.1)<br>
[2.1.3 Lines](#2.1.3) <br>
[2.1.4 MultiLine](#2.1.4) <br>
[2.1.4.1 The zip() function](#2.1.4.1) <br>
[2.1.5 Polygon](#2.1.5) <br>
[2.1.6 MultiPolygon](#2.1.6) <br>
[2.2 Creating vector data using Geopandas](#2.2) <br>

<br>

__[Markdown guide #1](https://www.datacamp.com/tutorial/markdown-in-jupyter-notebook)__ <br>
__[Markdown guide #2](https://www.ibm.com/docs/en/watson-studio-local/1.2.3?topic=notebooks-markdown-jupyter-cheatsheet)__ <br><br>
__[Git/GitHub Guide #1](https://www.freecodecamp.org/news/introduction-to-git-and-github/)__

### <a id="1.0">1.0 Non-spatial data</a>
---
*Non-spatial data refers to data that does not have any inherent spatial or geographic component.* 
- Represents information not directly associated with specific locations or coordinates on the Earth’s surface. 
- Types can include numerical values, text, categorical variables, dates, and more. 
- Focuses on attributes and properties that are not directly tied to specific locations.
- Can be combined and analyzed with spatial data. <br> <br>
[Top](#1)

### <a id="1.1">1.1 Reading tables into DataFrames with Pandas</a>

*<b>What's happening:</b> Importing pandas and reading the cat GPS data from a CSV file into a DataFrame.* <br>

---

*Code explained:*

<blockquote> <b>Line 1:</b>
    Import the pandas library and give it the alias pd, so that we do not need to type pandas every time we use a function from this library. <br> 
    <b>Line 2:</b> Use the read_csv function from pandas to read the CSV file. We store the resulting DataFrame in a variable called cats_data.</blockquote> 

<b>Result explained:</b> pandas raises a DtypeWarning because column 6 contains mixed data types.
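
One way to keep pandas from guessing the type of a mixed column is to declare it when reading the file. The sketch below is only an illustration: the tiny in-memory CSV and the variable names `csv_text` and `sample` are hypothetical stand-ins for the lab file, and `dtype` is a standard `read_csv` parameter.

```python
import io
import pandas as pd

# Hypothetical miniature CSV standing in for Lab2_data/pet_cats_nz.csv
csv_text = "event-id,manually-marked-outlier\n1,False\n2,True\n3,\n"

# Declaring the dtype up front means pandas does not have to infer it
sample = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"manually-marked-outlier": "object"},
)

print(sample.dtypes)
print(sample["manually-marked-outlier"].isna().sum())  # number of missing values
```

For this lab the warning is harmless, so this step is optional.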

[Top](#1)


#### 1.1 Code below

In [17]:
#[1]
import pandas as pd
#[2]
cats_data= pd.read_csv("Lab2_data/pet_cats_nz.csv")



### <a id="1.2">1.2 Slicing data with Pandas</a>

*<b>What's happening:</b> Explore column 6 to understand what's going on. We don't know the name for column 6 yet, therefore we will use the .iloc property to access it.* <br> 

---

*Code explained:*
<blockquote>    
    <b>Syntax for .iloc:</b> DataFrame.iloc[ from row number : to row number , from col number : to col number] <b>or</b> DataFrame.iloc[ row number, col number] <br><br>
    <b>Line 1:</b>
    Note that the row interval is empty and marked only by a colon (:). This is an easy way to select all rows, especially if you do not know the exact number of rows. <br> 
  </blockquote> 

<b>Result explained:</b> The output shows the column name (Name: manually-marked-outlier) and values for that column : False, True and NaN. 
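
The two forms of the .iloc syntax can be tried on a small table. This is a minimal sketch on a hypothetical three-row DataFrame (`df` and its columns are made up for illustration, not taken from the cat data).

```python
import pandas as pd

# Hypothetical three-row table, used only to illustrate positional indexing
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "x": [1.0, 2.0, 3.0],
    "flag": [True, False, True],
})

all_rows_col_2 = df.iloc[:, 2]      # every row, third column (a Series)
first_two_rows = df.iloc[0:2, 0:2]  # rows 0-1 and columns 0-1 (end positions are excluded)
single_cell = df.iloc[1, 1]         # one value: row 1, column 1

print(all_rows_col_2.name)
print(first_two_rows.shape)
print(single_cell)
```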

[Top](#1)

#### 1.2 Code below

In [13]:
cats_data.iloc[:,6]

0         False
1         False
2         False
3         False
4         False
          ...  
406467      NaN
406468      NaN
406469     True
406470      NaN
406471      NaN
Name: manually-marked-outlier, Length: 406472, dtype: object

### <a id="1.3">1.3 Retrieving unique values</a>

*<b>What's happening:</b> Finding more information about the data* <br> 

---

*Code explained:*
<blockquote>    
    <b>Line 1:</b> Use pd.unique() to check which values occur in the manually-marked-outlier column<br> 
    <b>Line 2:</b> Show the first 10 rows of the cats data<br>
    <b>Line 3:</b> Show the names of all cats participating in the study<br>
    <b>Line 4:</b> Confirm the number of cats we have in the dataset<br>
    <b>Line 5:</b> Count the number of records (GPS locations) for each cat, sorted from most to fewest</blockquote> 

<b>Result explained: [line 1]</b> We have mixed data types in that column. True and False are Boolean values, and NaN is a special floating-point value used to represent missing or undefined data. As this will not interfere with our analysis, we can keep going without importing the data again.


*<b>Syntax:</b> pd.unique(DataFrame['column_name']). We access the data in a specific column by writing the DataFrame name followed by the column name in quotation marks between square brackets.*
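
The difference between unique(), nunique() and value_counts() is easier to see on a short Series. A minimal sketch, assuming a hypothetical list of six cat names:

```python
import pandas as pd

# Hypothetical column of cat names (one entry per GPS record)
names = pd.Series(["Luna", "Boots", "Luna", "Oscar", "Luna", "Oscar"], name="cat")

unique_names = pd.unique(names)  # distinct values, in order of first appearance
n_cats = names.nunique()         # how many distinct values there are
counts = names.value_counts()    # records per name, most frequent first

print(list(unique_names))
print(n_cats)
print(counts)
```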

[Top](#1)

#### 1.3 Code below

In [26]:
#[1]
pd.unique(cats_data['manually-marked-outlier'])
#[2]
cats_data.head(10)
#[3]
pd.unique(cats_data['individual-local-identifier'])
#[4]
cats_data['individual-local-identifier'].nunique()
#[5]
cats_data['individual-local-identifier'].value_counts()

individual-local-identifier
Luna          5151
Whiskey       4381
SkyII         3694
BellaII       3647
Penny         3630
              ... 
Greyskull2     177
Aggie          156
Oscar          144
Barnaby1        98
Boots           11
Name: count, Length: 233, dtype: int64

### <a id="1.4">1.4 Selecting data based on attributes</a>

*<b>What's happening:</b> We are exploring Luna the cat further as an example for how you can filter data.* <br> 

---

*Code explained:*
<blockquote>  
    <b>Syntax:</b> A selection based on an attribute of a Pandas DataFrame uses a condition with the following syntax: DataFrame['column_name'] comparison_operator (==, >, <, etc.) desired_value. Placing the condition inside DataFrame[ ] returns only the rows where it is true.<br>
    <b>Line 1:</b> Use the structure and syntax explained above to create a dataset with GPS data only for the cat named Luna. "Luna" is a string and MUST be placed between quotation marks. <br> 
    <b>Line 2:</b> Inspect the result by calling the new DataFrame </blockquote> 



<b>Result explained:</b> Our subset dataset for Luna has 5151 records and the same 12 columns as the original data (5151 rows × 12 columns).
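
The selection happens in two steps: the comparison produces a Boolean Series, and that Series is then used to keep only the matching rows. A minimal sketch on a hypothetical three-row table (the values are made up); adding `.copy()` is an optional choice that makes the subset an independent DataFrame.

```python
import pandas as pd

# Hypothetical records for two cats
df = pd.DataFrame({
    "cat": ["Luna", "Boots", "Luna"],
    "lat": [-41.1289, -41.2000, -41.1290],
})

mask = df["cat"] == "Luna"  # Boolean Series: True where the name matches
luna = df[mask].copy()      # keep only the True rows, as an independent copy

print(mask.tolist())
print(luna)
```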


[Top](#1)

#### 1.4 Code below

In [32]:
#[1]
cat= cats_data[cats_data['individual-local-identifier'] == "Luna"]
#[2]
cat

Unnamed: 0,event-id,visible,timestamp,location-long,location-lat,algorithm-marked-outlier,manually-marked-outlier,sensor-type,individual-taxon-canonical-name,tag-local-identifier,individual-local-identifier,study-name
123031,1017621486,True,2015-09-19 12:02:09.000,175.050461,-41.128876,,False,gps,Felis catus,LunaTag,Luna,Pet Cats New Zealand
123032,1017621487,True,2015-09-19 12:05:18.000,175.050262,-41.128883,,False,gps,Felis catus,LunaTag,Luna,Pet Cats New Zealand
123033,1017621488,False,2015-09-19 12:08:27.000,175.050293,-41.128960,,True,gps,Felis catus,LunaTag,Luna,Pet Cats New Zealand
123034,1017621489,False,2015-09-19 12:12:06.000,175.050446,-41.128845,,True,gps,Felis catus,LunaTag,Luna,Pet Cats New Zealand
123035,1017621490,False,2015-09-19 12:15:45.000,175.050583,-41.128906,,True,gps,Felis catus,LunaTag,Luna,Pet Cats New Zealand
...,...,...,...,...,...,...,...,...,...,...,...,...
128177,1107850120,False,2015-12-11 11:37:27.000,175.050339,-41.128876,,True,gps,Felis catus,LunaTag,Luna,Pet Cats New Zealand
128178,1107850121,False,2015-12-11 11:45:27.000,175.050156,-41.129082,,True,gps,Felis catus,LunaTag,Luna,Pet Cats New Zealand
128179,1107850122,False,2015-12-11 11:51:08.000,175.050400,-41.129005,,True,gps,Felis catus,LunaTag,Luna,Pet Cats New Zealand
128180,1107850123,False,2015-12-11 11:57:13.000,175.050354,-41.128963,,True,gps,Felis catus,LunaTag,Luna,Pet Cats New Zealand


### <a id="1.5">1.5 Formatting timestamps</a>

*<b>What's happening:</b> We want to sort our new DataFrame by the timestamp column, so that the GPS points are ordered in time and form a temporal series of locations for Luna. Before sorting, we convert the timestamp column to a datetime data type.* <br> 

---

*Code explained:*
<blockquote>    
    <b>Line 1:</b> Check the data type of the timestamp column using the dtypes property<br> 
    <b>Line 2:</b> Perform the datetime conversion and store the result in a new column named timeline<br>
    <i>Line 2 triggers a SettingWithCopyWarning. This warning typically occurs when you assign a value to a subset of a DataFrame using chained indexing, or when you directly modify a slice of a DataFrame. Here cat is a slice of cats_data, so pandas cannot tell whether we meant to change the original. This is not an issue in this case, so we can ignore the warning and keep going.</i><br>
    <b>Line 3:</b> Check the data type of the timeline column using the dtypes property<br>
    <b>Line 4:</b> Call the columns timeline and timestamp from the cat DataFrame and visually compare them. Note: we need to use double [] when we are selecting multiple columns at the same time
</blockquote> 

<b>Result explained: lines 1 and 2</b>, In Pandas, dtype('O') indicates that a column or Series contains arbitrary Python objects (here, text strings) rather than a specific numerical or datetime data type. Therefore we first need to convert the 'timestamp' column from the object data type (dtype('O')) to a datetime data type using the pd.to_datetime() function. It takes the column to be converted and, optionally, the datetime format. If we do not specify the format, pandas tries to infer it, which usually works for standard, well-known datetime formats. <br>
<b>Result explained: line 3</b>, In Pandas, dtype('<M8[ns]') represents a datetime64 data type with nanosecond precision. This means the conversion was successful.
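
The conversion can also be written with an explicit format instead of letting pandas infer it. A minimal sketch, assuming a hypothetical two-row DataFrame whose strings follow the same pattern as the timestamp column:

```python
import pandas as pd

# Hypothetical two-row table with timestamps stored as text
df = pd.DataFrame({"timestamp": ["2015-09-19 12:05:18.000", "2015-09-19 12:02:09.000"]})

# %f matches the fractional seconds after the dot
df["timeline"] = pd.to_datetime(df["timestamp"], format="%Y-%m-%d %H:%M:%S.%f")

print(df.dtypes)
print(df)
```

Because `df` here is a standalone DataFrame rather than a slice of another one, adding the new column does not raise the warning discussed above.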




[Top](#1)

#### 1.5 Code below

In [38]:
#[1]
cat['timestamp'].dtypes
#[2]
cat['timeline']=pd.to_datetime(cat['timestamp'])
#[3]
cat['timeline'].dtypes
#[4]
cat[['timestamp','timeline']]

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  cat['timeline']=pd.to_datetime(cat['timestamp'])


Unnamed: 0,timestamp,timeline
123031,2015-09-19 12:02:09.000,2015-09-19 12:02:09
123032,2015-09-19 12:05:18.000,2015-09-19 12:05:18
123033,2015-09-19 12:08:27.000,2015-09-19 12:08:27
123034,2015-09-19 12:12:06.000,2015-09-19 12:12:06
123035,2015-09-19 12:15:45.000,2015-09-19 12:15:45
...,...,...
128177,2015-12-11 11:37:27.000,2015-12-11 11:37:27
128178,2015-12-11 11:45:27.000,2015-12-11 11:45:27
128179,2015-12-11 11:51:08.000,2015-12-11 11:51:08
128180,2015-12-11 11:57:13.000,2015-12-11 11:57:13


### <a id="1.6">1.6 Sorting by attribute</a>

*<b>What's happening:</b> Sorting the dataframe by attribute using .sort_values().* <br> 

---

*Code explained:*
<blockquote>    
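    <b>Syntax:</b> DataFrame.sort_values(by='column_name', ascending=True, inplace=True) <br>
    <b>Line 1:</b> Use .sort_values() to order the cat DataFrame from the earliest to the latest datetime in the timeline column. With inplace=True the DataFrame itself is modified instead of a sorted copy being returned. <br> 
    <b>Line 2:</b> Reset the index so that the row labels run from 0 upwards in the new sorted order. drop=True discards the old index instead of keeping it as a new column. <br>
    <b>Line 3:</b> Inspect the result </blockquote> 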

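<b>Result explained: function of reset_index</b> The index now starts at 0 and follows the chronological order of the rows, instead of keeping the row labels inherited from cats_data (which started at 123031).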
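
The same two steps can be written without inplace=True, by assigning the returned DataFrame to a new name. This is a sketch on a hypothetical three-row table with deliberately unordered times and arbitrary index labels:

```python
import pandas as pd

# Hypothetical records with out-of-order times and arbitrary row labels
df = pd.DataFrame(
    {
        "timeline": pd.to_datetime(
            ["2015-09-19 12:05:18", "2015-09-19 12:02:09", "2015-09-19 12:08:27"]
        ),
        "fix": ["b", "a", "c"],
    },
    index=[11, 10, 12],
)

# sort_values returns a sorted copy; reset_index(drop=True) renumbers the rows from 0
ordered = df.sort_values(by="timeline", ascending=True).reset_index(drop=True)

print(ordered)
```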

[Top](#1)

#### 1.6 Code below

In [42]:
#[1]
cat.sort_values(by="timeline", ascending= True, inplace= True)
#[2]
cat.reset_index(inplace= True, drop= True)
#[3] 
cat

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  cat.sort_values(by="timeline", ascending= True, inplace= True)


Unnamed: 0,event-id,visible,timestamp,location-long,location-lat,algorithm-marked-outlier,manually-marked-outlier,sensor-type,individual-taxon-canonical-name,tag-local-identifier,individual-local-identifier,study-name,timeline
0,1017621486,True,2015-09-19 12:02:09.000,175.050461,-41.128876,,False,gps,Felis catus,LunaTag,Luna,Pet Cats New Zealand,2015-09-19 12:02:09
1,1017621487,True,2015-09-19 12:05:18.000,175.050262,-41.128883,,False,gps,Felis catus,LunaTag,Luna,Pet Cats New Zealand,2015-09-19 12:05:18
2,1017621488,False,2015-09-19 12:08:27.000,175.050293,-41.128960,,True,gps,Felis catus,LunaTag,Luna,Pet Cats New Zealand,2015-09-19 12:08:27
3,1017621489,False,2015-09-19 12:12:06.000,175.050446,-41.128845,,True,gps,Felis catus,LunaTag,Luna,Pet Cats New Zealand,2015-09-19 12:12:06
4,1017621490,False,2015-09-19 12:15:45.000,175.050583,-41.128906,,True,gps,Felis catus,LunaTag,Luna,Pet Cats New Zealand,2015-09-19 12:15:45
...,...,...,...,...,...,...,...,...,...,...,...,...,...
5146,1107850120,False,2015-12-11 11:37:27.000,175.050339,-41.128876,,True,gps,Felis catus,LunaTag,Luna,Pet Cats New Zealand,2015-12-11 11:37:27
5147,1107850121,False,2015-12-11 11:45:27.000,175.050156,-41.129082,,True,gps,Felis catus,LunaTag,Luna,Pet Cats New Zealand,2015-12-11 11:45:27
5148,1107850122,False,2015-12-11 11:51:08.000,175.050400,-41.129005,,True,gps,Felis catus,LunaTag,Luna,Pet Cats New Zealand,2015-12-11 11:51:08
5149,1107850123,False,2015-12-11 11:57:13.000,175.050354,-41.128963,,True,gps,Felis catus,LunaTag,Luna,Pet Cats New Zealand,2015-12-11 11:57:13


### <a id="2">2 Vector Data</a>
*<b> Definition:</b> Vector data refers to a type of geospatial data that represents geometric objects using points, lines, and polygons. It is commonly used to represent features such as roads, buildings, and boundaries. Each object in vector data is defined by its geometry (coordinates) and can also have associated attributes such as names or numerical values. Shapely and Geopandas are the two main Python libraries for manipulating vector data.*

### <a id="2.1">2.1 Creating geometries using shapely</a>

*<b>What's happening:</b> Learning how to set up geometries.* <br> 

---

*Code explained:*
<blockquote>    
    <b>Syntax:</b> from module import name1, name2, ... <br>
    Shapely is a Python library for geometric operations and analysis. It simplifies complex computations on points, lines, and polygons. <br>
    <b>from:</b> A keyword that signals that you want to import specific elements from a module. <br>
    <b>module:</b> The name of the module from which you want to import elements. It can be a built-in module, a standard library module, a third-party module, or a custom module you have created. <br>
    <b>import:</b> A keyword followed by the specific elements you want to import from the module. <br><br>
    <b>Line 1:</b>
    Import the fundamental geometry types implemented as Shapely objects <br> 
    <b>Line 2:</b> Select data based on row index and column name. Stores in x the value of the cell defined by the first row (index 0) and the column 'location-long' <br>
    <b>Line 3:</b> Stores in y the value of the cell defined by the first row (index 0) and the column 'location-lat' <br>
    <b>Line 4:</b> Use the print function to display the values of the x and y variables

</blockquote> 

<b>Result explained:</b> We are preparing to create a point representing the first GPS location registered for Luna in the cat DataFrame. First we select the x (longitude) and y (latitude) values for the first record in the cat DataFrame, which we sorted by timeline. We use .loc[] because it allows us to select data based on row index and column name.
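
.loc selects by index label, whereas .iloc selects by position. The two only coincide when the index runs 0, 1, 2, ..., which is why resetting the index earlier matters. A minimal sketch, assuming a hypothetical two-row DataFrame with labels 10 and 11:

```python
import pandas as pd

# Hypothetical table whose index labels do not start at 0
df = pd.DataFrame(
    {
        "location-long": [175.050461, 175.050262],
        "location-lat": [-41.128876, -41.128883],
    },
    index=[10, 11],
)

x_by_label = df.loc[10, "location-long"]  # row labelled 10
x_by_position = df.iloc[0, 0]             # first row, first column

print(x_by_label, x_by_position)
```

Here `df.loc[0, "location-long"]` would raise a KeyError, because no row carries the label 0.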

[Top](#1)

#### 2.1 Code below

In [54]:
#[1]
from shapely.geometry import MultiPoint, Point, MultiLineString, LineString, MultiPolygon, Polygon
#[2]
x= cat.loc[0,'location-long']
#[3]
y= cat.loc[0,'location-lat']

In [53]:
#[4]
print(x, y)

175.050461 -41.128876


### <a id="2.1.1">2.1.1 Point</a>

*<b>What's happening:</b> Now we will use the x and y variables to create a Point object and store it in a variable named p1.* <br><br>
*Code explained:*
<blockquote> <b>Line 1:</b> Use the Point function to create your point object<br> 
    <b>Line 2:</b> Visualize the resulting object<br>
     <b>Line 3:</b> Use the Point function to create your point object representing the 2nd location recorded in the cat DataFrame for Luna.<br>
     <b>Line 4:</b> Visualize the resulting object<br>
    <b>Line 5:</b> Use the Point function to create your point object representing the 3rd location recorded in the cat DataFrame for Luna<br>
    <b>Line 6:</b> Visualize the resulting object <br>
    <b>Line 7:</b> Calculate the minimum Euclidean distance between two shapely objects <i>(p1 and p3 of Luna’s trajectory)</i> by using the .distance() method.<br>
    <b>Line 8:</b> Extract x and y coordinates from a point object, using the .x and .y properties:</blockquote> 

<b>Result explained: Line 7 </b> The distance is currently measured in decimal degrees, because the location-long and location-lat columns we used to generate these points are in decimal degrees.
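
A distance in decimal degrees is hard to interpret. One rough way to get metres, without reprojecting the data, is the haversine formula. The sketch below is not part of the lab: the function name `haversine_m` is hypothetical, and it assumes a spherical Earth with a radius of 6,371 km, so the result is an approximation.

```python
import math

def haversine_m(lon1, lat1, lon2, lat2, radius_m=6371000.0):
    """Approximate great-circle distance in metres between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))

# Coordinates of p1 and p3 as printed in the cat DataFrame
d13 = haversine_m(175.050461, -41.128876, 175.050293, -41.128960)
print(round(d13, 1), "metres (approximate)")
```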

[Top](#1)

#### 2.1.1 Code below

In [64]:
#[1]
p1=Point(x,y)
#[2]
p1
#[3]
p2=Point(cat.loc[1,'location-long'],cat.loc[1,'location-lat'])
#[4]
p2
#[5]
p3=Point(cat.loc[2,'location-long'],cat.loc[2,'location-lat'])
#[6]
p3
#[7]
p1.distance(p3)

0.00018782971011240446

In [63]:
#[8]
print(p1.x, p1.y) 

175.050461 -41.128876


### <a id="2.1.2">2.1.2 MultiPoint</a>

*<b>What's happening:</b> The three points we have created so far (p1, p2 and p3) represent sequential positions for the same cat, Luna. Therefore we can group them into a MultiPoint object. Line 4 uses a for loop to read the coordinates; list comprehension (2.1.2.1) is an alternative.* <br> 

---

*Code explained:*
<blockquote>    
        <b>Line 1:</b> Use the MultiPoint function to create a MultiPoint object that contains p1, p2 and p3. Note that a list of objects [] is passed to MultiPoint() <br> 
    <b>Line 2:</b> Visualize the resulting object<br> 
    <b>Line 3:</b> Show the content of mp_luna using the print() function.<br>
    <b>Line 4:</b> Extract the coordinates of each point within a MultiPoint object using the .x and .y properties combined with the .geoms property. This line extracts the x coordinates. </blockquote> 

<b>Result explained:</b> A MultiPoint object is similar to a Point object in Shapely but can contain multiple points instead of just one. It is useful when you want to work with a group or set of points as a single entity. Each point within a MultiPoint is defined by its x and y coordinates.<b> Line 3:</b> The visual output does not display well because the points are so close together that they appear to overlap; print() allows further inspection.

[Top](#1)

#### 2.1.2 Code below

In [66]:
#[1]
mp_luna= MultiPoint([p1,p2,p3])
#[2]
mp_luna
#[3]
print(mp_luna)
#[4]
for p in mp_luna.geoms:
    print(p.x)

MULTIPOINT (175.050461 -41.128876, 175.050262 -41.128883, 175.050293 -41.12896)
175.050461
175.050262
175.050293


### <a id="2.1.2.1">2.1.2.1 List comprehension</a>

*<b>What's happening:</b> An alternative to the for loop used in 2.1.2. List comprehension is a concise and powerful feature in Python that allows you to create new lists by performing operations on existing iterables, such as lists, tuples, or strings.* <br> 

---

*Code explained:*
<blockquote> <b>Syntax, list comprehension:</b> [expression for item in iterable if condition] <br>
<b>expression:</b> The operation or transformation applied to each item in the iterable. This expression generates the elements of the new list. <br>
<b>item:</b> The variable that takes the value of each element of the iterable in turn. <br>
<b>iterable:</b> The existing sequence over which the iteration occurs, such as a list, tuple, or string. <br>
<b>if condition (optional):</b> A conditional statement that filters the items of the iterable. Only the items that satisfy the condition are included in the new list.<br><br>
    <b>Line 1:</b> Use list comprehension to extract the x coordinates from each point in mp_luna. Here p.x is our expression, p is our item and mp_luna.geoms is the iterable.<br> 
    <b>Line 2:</b> Extract the y coordinates from each point in mp_luna<br> 
    </blockquote> 
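
The optional condition is not used in the lab code, so here is a minimal sketch of it in plain Python. The coordinate pairs are the three points printed for mp_luna; the threshold 175.0503 is an arbitrary value chosen for illustration.

```python
# (x, y) pairs for p1, p2 and p3
coords = [(175.050461, -41.128876), (175.050262, -41.128883), (175.050293, -41.12896)]

# No condition: one output element per input element
xs = [x for x, y in coords]

# With a condition: keep only x values east of an arbitrary threshold
east = [x for x, y in coords if x > 175.0503]

print(xs)
print(east)
```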


[Top](#1)

#### 2.1.2.1 Code below

In [70]:
#[1]
[p.x for p in mp_luna.geoms] 
#[2]
[p.y for p in mp_luna.geoms]

[-41.128876, -41.128883, -41.12896]

### <a id="2.1.3">2.1.3 Lines</a>

*<b>What's happening:</b> Create a LineString representing Luna’s path between p1, p2 and p3.* <br> 

---

*Code explained:*
<blockquote>    
    <b>Line 1:</b> Create the LineString. Note that the points are passed inside a list []. If we were to create the line from coordinates, we would use the following structure: [(x1,y1),(x2,y2),(x3,y3)] <br> 
    <b>Line 2:</b> Note that the LINESTRING is made of pairs of coordinates (x,y) that came from the points (p1,p2,p3) we gave as input.<br>
    <b>Line 3:</b> We can check the length of Luna’s path using the .length property
    
 </blockquote> 

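<b>Result explained:</b> Note that the length is measured in decimal degrees because the location-long and location-lat columns are in decimal degrees. To obtain a length in metres, the coordinates would first need to be converted to a projected coordinate system.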
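
The length of a LineString is the sum of the straight-line distances between consecutive vertices. A short sketch that checks this for Luna’s first three points, building the line from coordinate tuples (the variable names are chosen for illustration only):

```python
import math
from shapely.geometry import LineString, Point

# Coordinates of p1, p2 and p3
coords = [(175.050461, -41.128876), (175.050262, -41.128883), (175.050293, -41.12896)]

path = LineString(coords)
a, b, c = (Point(xy) for xy in coords)

# Sum of the two segment lengths, still in decimal degrees
segment_sum = a.distance(b) + b.distance(c)

print(path.length, segment_sum)
print(math.isclose(path.length, segment_sum))
```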

[Top](#1)

#### 2.1.3 Code below

In [78]:
#[1]
path_luna= LineString([p1,p2,p3])
path_luna
#[2]
print(path_luna)
#[3]
path_luna.length

LINESTRING (175.050461 -41.128876, 175.050262 -41.128883, 175.050293 -41.12896)


0.00028212910140471264

### <a id="2.1.4">2.1.4 MultiLine</a>

*<b>What's happening:</b> The MultiLineString object represents a collection of LineString objects. It is used to store and manipulate multiple line segments as a single geometric entity. We can create a MultiLineString object to represent Luna’s paths for two different days, for example. To do that we first need to select data for the days we want.* <br> 

---

*Code explained:*
<blockquote>    
    <b>Syntax: pd.unique, </b> follows this structure: pd.unique(dataframe.datetime_column.dt.date) <br><br>
    <b>Line 1:</b> First we check the data again to identify which column we can use to select by day <br>
    <i>To select by day we first need to obtain all the unique (non-repeated) days in that column. We can use .dt.date to extract only the date for each item in a Pandas Series (a column in a Pandas DataFrame is a Pandas Series).</i>  <br>
    <b>Line 2:</b> Then we use pd.unique to return the non-repeated days.  <br>
    <b>Line 3:</b> Inspect the result to see the days included in the data <br> 
    <b>Line 4:</b> We use indices <i>denoted [ ]</i> to select a specific day from the variable days. Here we use [0] to select the first day in the list.<br>
    <i>Use attribute selection to create two new DataFrames with data only for the first and second days in the dataset.</i> <br>
     <b>Line 5:</b> Create a DataFrame for the first day by selecting only rows where timeline.dt.date == days[0], i.e. timeline.dt.date == 19/09/2015.<br>
    <b>Line 6:</b> Check the number of rows found for that day, i.e. the number of GPS locations collected on that day <br>
    <b>Line 7:</b>  Create a DataFrame for the second day by selecting only rows where timeline.dt.date == days[1], i.e. timeline.dt.date == 21/09/2015<br>
    <b>Line 8:</b> Check the number of rows found for that day, i.e. the number of GPS locations collected on that day
<br>
    <b>Line 9:</b> Check the dataset<br> </blockquote> 
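
The day-by-day selection can be tried on a few rows. In this sketch the DataFrame is hypothetical (the third timestamp is invented so that a second day exists), and it also shows that len() counts rows while .size counts every cell (rows × columns).

```python
import pandas as pd

# Hypothetical records spanning two days
df = pd.DataFrame({
    "timeline": pd.to_datetime(
        ["2015-09-19 12:02:09", "2015-09-19 12:05:18", "2015-09-21 08:00:00"]
    ),
    "fix": ["a", "b", "c"],
})

days = pd.unique(df["timeline"].dt.date)           # distinct calendar dates
day_1 = df[df["timeline"].dt.date == days[0]]      # rows recorded on the first date

print(days)
print(len(day_1))   # number of rows
print(day_1.size)   # number of cells: rows x columns
```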

[Top](#1)

#### 2.1.4 Code below

In [84]:
#[1]
cat.head(3)
#[2]
days=pd.unique(cat.timeline.dt.date)
#[3]
days
#[4]
days[0]
#[5]
day_1= cat[cat.timeline.dt.date == days[0]]
#[6]
len(day_1)  # number of rows; .size would return rows x columns
#[7]
day_2= cat[cat.timeline.dt.date == days[1]]
#[8]
len(day_2)  # number of rows for the second day
#[9]
day_1.head(3)

Unnamed: 0,event-id,visible,timestamp,location-long,location-lat,algorithm-marked-outlier,manually-marked-outlier,sensor-type,individual-taxon-canonical-name,tag-local-identifier,individual-local-identifier,study-name,timeline
0,1017621486,True,2015-09-19 12:02:09.000,175.050461,-41.128876,,False,gps,Felis catus,LunaTag,Luna,Pet Cats New Zealand,2015-09-19 12:02:09
1,1017621487,True,2015-09-19 12:05:18.000,175.050262,-41.128883,,False,gps,Felis catus,LunaTag,Luna,Pet Cats New Zealand,2015-09-19 12:05:18
2,1017621488,False,2015-09-19 12:08:27.000,175.050293,-41.12896,,True,gps,Felis catus,LunaTag,Luna,Pet Cats New Zealand,2015-09-19 12:08:27


### <a id="2.1.4.1">2.1.4.1 The zip() function</a>

*<b>What's happening:</b> We are creating a MultiLineString object representing Luna’s paths for both days. We will use the zip() function to combine elements from multiple iterables* <br> 

---

*Code explained:*
<blockquote>    
    <b>Syntax: Basic zip(),</b> zip(iterable1, iterable2, ...) <br><br>
    <b>Line 1:</b> Schematic example of the syntax<br> 
    <i>Use the zip() function to create individual LineStrings for each day, then create a MultiLineString object for both days and compare them</i> <br>
    <b>Line 2:</b> Create a LineString for day one. <br>
    <b>Line 3:</b> Now we create the MultiLineString object. Note that the zipped coordinates for day 1 and day 2 are passed as a list [] and separated by a comma. <br>
    <b>Line 4:</b> Inspect the result<br> 
    <b>Line 5:</b> Check the total length of Luna’s paths using the .length property <br>
    <b>Line 6:</b> To get the length covered on each day, we use list comprehension again with .geoms.</blockquote> 
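
Two properties of zip() are worth knowing when building geometries from it: it yields one tuple per position, and it is an iterator that can only be consumed once. A minimal sketch in plain Python, using the first three coordinates from the cat DataFrame:

```python
long = [175.050461, 175.050262, 175.050293]
lat = [-41.128876, -41.128883, -41.12896]

# Turn the iterator into a list so the pairs can be reused
pairs = list(zip(long, lat))

# zip(*pairs) reverses the operation and separates x from y again
xs, ys = zip(*pairs)

# A zip object is exhausted after one pass
z = zip(long, lat)
first_pass = list(z)
second_pass = list(z)  # empty, because z has already been consumed

print(pairs)
print(len(first_pass), len(second_pass))
```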


[Top](#1)

#### 2.1.4.1 Code below

In [86]:
#[1]
long = [1, 2, 3]
lat = [5,4,12]

zipped = zip(long, lat)

for pair in zipped:
    print(pair)
    
# Expected output:
# (1, 5)
# (2, 4)
# (3, 12)

(1, 5)
(2, 4)
(3, 12)

In [89]:
#[2]
LineString(zip(day_1['location-long'], day_1['location-lat']))
#[3]
path_d1d2= MultiLineString([zip(day_1['location-long'], day_1['location-lat']),zip(day_2['location-long'], day_2['location-lat'])])
#[4]
path_d1d2
#[5]
print(path_d1d2.length)
#[6]
print([x.length for x in path_d1d2.geoms])

0.1630084368947502
[0.08204094490944104, 0.08096749198530914]


### <a id="2.1.5">2.1.5 Polygon</a>

*<b>What's happening:</b>  We will create a polygon to represent the bounding box that encompasses Luna’s trajectories for all days.* <br> 

---

*Code explained:*
<blockquote>    
    <b>Syntax:</b> DataFrame['column_name'].min() and DataFrame['column_name'].max() return the smallest and largest value in a column <br>
    <b>Line 1:</b> Find the minimum and maximum x (long) and y (lat) coordinates in Luna’s GPS data <br> 
    <b>Line 2:</b> Inspect these values <br> 
</blockquote> 

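<b>Result explained:</b> The four printed values are, in order, xmin, ymin, xmax and ymax. They define the corners of the bounding box in decimal degrees.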
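
The code cell for this section stops at the minimum and maximum values, so the following is only a sketch of how the bounding box itself might be built from them. It uses the four values printed in the lab output. Polygon takes a list of corner coordinates, and shapely also provides a box() helper that builds the same rectangle; the names `bbox` and `same_bbox` are chosen for illustration.

```python
from shapely.geometry import Polygon, box

# Values printed by the lab code: xmin, ymin, xmax, ymax
xmin, ymin, xmax, ymax = 174.786041, -41.132217, 175.066498, -41.120335

# Corners listed in order around the rectangle; shapely closes the ring automatically
bbox = Polygon([(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax)])

# Equivalent shortcut
same_bbox = box(xmin, ymin, xmax, ymax)

print(bbox.bounds)
print(bbox.equals(same_bbox))
```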

[Top](#1)

#### 2.1.5 Code below

In [90]:
#[1]
xmin,ymin,xmax, ymax = cat['location-long'].min(), cat['location-lat'].min(), cat['location-long'].max(), cat['location-lat'].max()
#[2]
print(xmin,ymin,xmax, ymax)

174.786041 -41.132217 175.066498 -41.120335


### <a id="2.1.6">2.1.6 MultiPolygon</a>

*<b>What's happening:</b> blah.* <br> 

---

*Code explained:*
<blockquote>    
    <b>Syntax:</b> blah <br>
    <b>Line 1:</b>
    blah <br> 
    <b>Line 2:</b> blah </blockquote> 

<b>Result explained:</b> blah

[Top](#1)

#### 2.1.6 Code below


### <a id="2.2">2.2 Creating vector data using Geopandas</a>

*<b>What's happening:</b> blah.* <br> 

---

*Code explained:*
<blockquote>    
    <b>Syntax:</b> blah <br>
    <b>Line 1:</b>
    blah <br> 
    <b>Line 2:</b> blah </blockquote> 

<b>Result explained:</b> blah

[Top](#1)

#### 2.2 Code below
