### Vector Data

> Vector data represent objects on the Earth's surface using their longitude and latitude, as well as combinations of coordinate pairs (lines, polylines, polygons, etc.).

#### Point Data
    
> A pair of coordinates (longitude, latitude) that represents the location of a point on the Earth's surface.

> Example : Location of drop boxes, landmarks, etc.

#### Lines 

> A series of points that represents a line (straight or otherwise) on the Earth's surface.

> Example : Centerlines of roads, rivers, etc.

#### Polygons

> A series of points (vertices) that defines the outer edge of a region.

> Example : Outlines of cities, countries, continents, etc.
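The three vector types can be sketched with plain Python tuples and lists before any GIS library is involved. This is only an illustration: the coordinates are made up, and the length and area helpers (`line_length`, `polygon_area`) are hypothetical names that work in planar units, not true distances on the Earth's surface.

```python
import math

# A point is a single (longitude, latitude) pair. Coordinates are illustrative.
point = (78.03, 30.32)

# A line is an ordered sequence of points.
line = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]

# A polygon is a ring of vertices; the first vertex is repeated at the end to close it.
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)]


def line_length(coords):
    """Sum of straight-line distances between consecutive vertices (planar units)."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))


def polygon_area(ring):
    """Planar area of a closed ring using the shoelace formula."""
    twice_area = sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(twice_area) / 2


print(line_length(line))      # 10.0
print(polygon_area(polygon))  # 12.0
```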

#### Credits :

> Google Earth : https://developers.google.com/earth-engine/tutorials/community/beginners-cookbook

> GeoPandas Tutorial : https://geopandas.org/


### Introduction to Pandas

#### Learning Objectives :

>> Gain an introduction to the DataFrame and Series data structures of the pandas library

>> Access and manipulate data within a DataFrame and Series.

>> Import CSV data into a pandas DataFrame


> pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.

> pandas is a column-oriented data analysis library for Python. It's a great tool for handling and analyzing input data.


### Pandas Basics 

The primary data structures in pandas are implemented as two classes :

>DataFrame, which you can imagine as a relational data table, with rows and named columns.

> Series, which is a single column. A DataFrame contains one or more Series and a name for each Series.

The DataFrame is the structure most commonly used for data manipulation.

In [None]:
import pandas as pd
import os
os.chdir(r'directory name')

>One way to create a Series is to construct a Series object by passing a list of values. For example :

In [None]:
pd.Series(['Dehradun', 'Almora', 'Pithoragarh'])

> DataFrame objects can be created by passing a dict mapping string column names to their respective Series.

In [None]:
city_names=pd.Series(['Dehradun','Almora','Pithoragarh'])
population=pd.Series([10000, 200000, 3000000])

In [None]:
population

In [None]:
pd.DataFrame({'City Name': city_names, 'population' : population})

### Reading a CSV file

In [None]:
file_name='stations.csv'

In [None]:
df=pd.read_csv(file_name)

In [None]:
df

> DataFrame.describe shows interesting statistics about a DataFrame. Other useful functions are DataFrame.head and DataFrame.tail, which display the first and last few records of a DataFrame :

In [None]:
df.describe()

In [None]:
df.head()

In [None]:
df.tail(10)
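Since the contents of `stations.csv` are not shown here, the same calls can be tried on a small in-memory CSV. The column names and values below are assumptions made for illustration; the real file may differ.

```python
import io
import pandas as pd

# Hypothetical station records; the real stations.csv may have different columns.
csv_text = """STATION ID,STATION,Latitude,Longitude
1,DEHRADUN,30.32,78.03
2,HARIDWAR,29.95,78.16
3,ALMORA,29.60,79.66
4,NAINITAL,29.38,79.46
"""

# read_csv accepts a path or any file-like object, so a StringIO stands in for a file.
stations = pd.read_csv(io.StringIO(csv_text))

print(stations.head(2))     # first two rows
print(stations.tail(1))     # last row
print(stations.describe())  # count, mean, std, min, quartiles, max of numeric columns
```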

### Accessing Data

You can access DataFrame data using familiar Python dict/list operations :

In [None]:
df['STATION ID']

In [None]:
df['STATION ID'][2]

In [None]:
df[df['STATION']=='HARIDWAR']
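The same three access patterns can be written out step by step on a small, made-up table. The data below is hypothetical, and `.loc` is shown as a more explicit alternative to chaining two sets of brackets.

```python
import pandas as pd

# Hypothetical data standing in for the stations table.
stations = pd.DataFrame({
    'STATION ID': [101, 102, 103],
    'STATION': ['DEHRADUN', 'HARIDWAR', 'ALMORA'],
    'Latitude': [30.32, 29.95, 29.60],
})

ids = stations['STATION ID']               # selecting one column returns a Series
third_id = stations.loc[2, 'STATION ID']   # row label 2, column 'STATION ID'
mask = stations['STATION'] == 'HARIDWAR'   # boolean Series, one value per row
haridwar = stations[mask]                  # keeps only the rows where mask is True

print(third_id)
print(haridwar)
```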

### Manipulating Data

> You may apply Python's basic arithmetic operations to a Series. For example :

In [None]:
df['Lat in Min']=df['Latitude']*60

In [None]:
df
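Arithmetic on a Series is applied element by element, so several operations can be combined without a loop. As a small sketch with illustrative latitude values, the snippet below splits decimal degrees into whole degrees and arc minutes; it assumes positive values only.

```python
import pandas as pd

lat = pd.Series([30.32, 29.95, 29.60], name='Latitude')  # illustrative decimal degrees

degrees = lat // 1              # whole degrees (floor division, valid for positive values)
minutes = (lat - degrees) * 60  # fractional part expressed in arc minutes

print(pd.DataFrame({'deg': degrees, 'min': minutes.round(2)}))
```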

### Indexes

>Both Series and DataFrame objects also define an index property that assigns an identifier value to each Series item or DataFrame row.

>By default, at construction, pandas assigns index values that reflect the ordering of the source data. Once created, the index values are stable; that is, they do not change when data is reordered.

In [None]:
df['STATION ID'].index

In [None]:
df['STATION ID'].reindex([2,0,1])

> Use DataFrame.sort_values to sort the rows by one or more columns. For example :

In [None]:
df.sort_values(by=['Latitude'])
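A short sketch on made-up values shows what "stable" means in practice: sorting and reindexing move rows around, but each value keeps its original index label. Asking `reindex` for a label that does not exist yields a missing value.

```python
import pandas as pd

s = pd.Series([30.32, 29.95, 29.60], index=[0, 1, 2], name='Latitude')

sorted_s = s.sort_values()        # rows move, labels travel with their values
reordered = s.reindex([2, 0, 1])  # rows arranged in the requested label order
extended = s.reindex([0, 1, 5])   # label 5 does not exist, so its value is NaN

print(sorted_s.index.tolist())    # [2, 1, 0]
print(reordered.tolist())         # [29.6, 30.32, 29.95]
print(extended.isna().tolist())   # [False, False, True]
```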

### Introduction to GeoPandas

#### Learning Objectives :

>Gain an introduction to the GeoDataFrame and GeoSeries data structures of the geopandas library 

>Access and manipulate data within a GeoDataFrame and GeoSeries

>Reading vector data

> Manipulating vector data

> Processing vector data

> Writing vector data


### GeoPandas Basics 

> GeoPandas is an open source project to make working with geospatial data in Python easier. GeoPandas extends the datatypes used by pandas to allow spatial operations on geometric types.

> The primary data structures in geopandas are implemented as two classes :

>> GeoSeries is essentially a vector where each entry is a set of shapes corresponding to one observation. GeoPandas has three basic classes of geometric objects :

>>> Points / Multi-Points

>>> Lines / Multi-Lines

>>> Polygons / Multi-Polygons

>> GeoDataFrame is a tabular data structure that contains a GeoSeries. The most important property of a GeoDataFrame is that it always has one GeoSeries column that holds a special status. This GeoSeries is referred to as the GeoDataFrame's "geometry".
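To make the relationship concrete, here is a minimal sketch that builds a GeoDataFrame from two points. It assumes `shapely` is installed (it normally comes with geopandas); the coordinates are illustrative and the `name` column is made up.

```python
import geopandas as gpd
from shapely.geometry import Point

# Illustrative coordinates in (longitude, latitude) order.
cities = gpd.GeoDataFrame(
    {'name': ['Dehradun', 'Almora']},
    geometry=[Point(78.03, 30.32), Point(79.66, 29.60)],
    crs='EPSG:4326',
)

print(type(cities.geometry).__name__)  # GeoSeries
print(cities.geometry.name)            # geometry
print(cities.geom_type.tolist())       # ['Point', 'Point']
```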


### Reading and Writing Spatial Data

### Vector Data Formats 

#### ESRI Shapefile :

The shapefile is a simple, nontopological format for storing the geometric location and attribute information of geographic features. The shapefile format defines the geometry and attributes of geographically referenced features in three or more files with specific file extensions :

>.shp - The main file that stores the feature geometry; required.

>.shx - The index file that stores the index of the feature geometry; required.

>.dbf - The dBASE table that stores the attribute information of features; required.

>.sbn and .sbx - The files that store the spatial index of the features

>.prj - The file that stores the coordinate system information
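Because a shapefile is really a group of files, a quick check for the required parts can save a confusing read error. The helper below is a hypothetical sketch using only the standard library; `missing_shapefile_parts` is not part of geopandas, and the check is case-sensitive on most file systems.

```python
from pathlib import Path

REQUIRED_EXTENSIONS = ('.shp', '.shx', '.dbf')


def missing_shapefile_parts(shp_path):
    """Return the required extensions that have no file next to shp_path."""
    base = Path(shp_path)
    return [ext for ext in REQUIRED_EXTENSIONS if not base.with_suffix(ext).exists()]


# An empty list means all three required parts were found.
print(missing_shapefile_parts('tehsil.shp'))
```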


#### GeoJSON :

GeoJSON is a format for encoding a variety of geographic data structures. A GeoJSON object may represent a region of space (a Geometry), a spatially bounded entity (a Feature), or a list of Features (a FeatureCollection). GeoJSON supports the following geometry types : Point, LineString, Polygon, MultiPoint, MultiLineString, MultiPolygon, and GeometryCollection. Features in GeoJSON contain a Geometry object and additional properties, and a FeatureCollection contains a list of Features.


{"type": "Feature",
"geometry": {
    "type": "Point",
    "coordinates": [78, 30]
},
"properties": {"name": "Dehradun"}
}
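Since GeoJSON is ordinary JSON, the Feature and FeatureCollection structure can be explored with Python's built-in `json` module. This sketch wraps a single Dehradun point Feature in a FeatureCollection and reads it back; no GIS library is involved.

```python
import json

feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [78, 30]},  # [longitude, latitude]
    "properties": {"name": "Dehradun"},
}

collection = {"type": "FeatureCollection", "features": [feature]}

text = json.dumps(collection)  # serialize to a GeoJSON string
parsed = json.loads(text)      # parse it back into Python objects

for f in parsed["features"]:
    lon, lat = f["geometry"]["coordinates"]
    print(f["properties"]["name"], lon, lat)  # Dehradun 78 30
```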


>Other vector data formats and sources include :

>Geography Markup Language

>Keyhole Markup Language

>MongoDB

>Microsoft SQL Server Spatial Database

>OGC WFS service

>etc.


### Reading Shape File

In [None]:
import geopandas as gpd
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from geopandas import GeoSeries

In [None]:
vector_file=''

In [None]:
india=gpd.read_file(vector_file)

In [None]:
india

In [None]:
india.geometry.name

In [None]:
india.crs

>Renaming a column

In [None]:
india=india.rename(columns={'geometry':'borders'}).set_geometry('borders')

In [None]:
india.head()

In [None]:
india.plot(figsize=(18,14))

### Reading Shape File (.zip)

In [None]:
vector_file= r'zip://zip file name.zip'

In [None]:
india = gpd.read_file(vector_file)

In [None]:
india.head()

In [None]:
india.plot(figsize=(18,14))

In [None]:
plt.close()

### Subsetting a GeoDataFrame

In [None]:
subset = india[india['Name_1']=='Karnataka']

In [None]:
subset.head()

In [None]:
subset.plot(figsize=(10,10))

In [None]:
plt.close()

### Reading subsets of the data

>Geometry Filter

In [None]:
bbox=(77.57, 28.72, 81.84, 31.5)  # (minx, miny, maxx, maxy); must be an ordered tuple, not a set

In [None]:
uk=gpd.read_file(vector_file, bbox=bbox)

In [None]:
uk

In [None]:
uk.plot()

In [None]:
plt.close()
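Conceptually, a bounding box filter keeps the features whose extent overlaps an ordered `(minx, miny, maxx, maxy)` box. The plain-Python sketch below illustrates that idea only; it is not how geopandas implements the filter, and the feature bounds are hypothetical.

```python
def bboxes_intersect(a, b):
    """True if two (minx, miny, maxx, maxy) boxes overlap or touch."""
    a_minx, a_miny, a_maxx, a_maxy = a
    b_minx, b_miny, b_maxx, b_maxy = b
    return (a_minx <= b_maxx and b_minx <= a_maxx
            and a_miny <= b_maxy and b_miny <= a_maxy)


query = (77.57, 28.72, 81.84, 31.5)

# Hypothetical feature extents, each as (minx, miny, maxx, maxy).
features = {
    'inside': (78.0, 29.0, 79.0, 30.0),
    'outside': (70.0, 20.0, 72.0, 22.0),
}

kept = [name for name, bounds in features.items() if bboxes_intersect(bounds, query)]
print(kept)  # ['inside']
```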

### Reading GeoJSON File

In [None]:
gson='tehsil_almora.geojson'

In [None]:
almora=gpd.read_file(gson)

In [None]:
almora.head()

In [None]:
almora.plot()

In [None]:
ax=almora.plot()
almora.apply(lambda x:ax.annotate(x.Tehsil,xy=x.geometry.centroid.coords[0],ha='center'),axis=1)
#plt.close()

In [None]:
plt.close()

### Writing Shape File

In [None]:
almora

In [None]:
population = pd.Series(np.linspace(5000,20000,8))

In [None]:
population

In [None]:
almora['Population']=population

In [None]:
almora.tail()

In [None]:
almora.to_file('almora.shp')

### Writing GeoJSON file 

In [None]:
almora.to_file('almora.geojson',driver='GeoJSON')

### Mapping with GeoPandas

In [None]:
vector_file='tehsil.shp'

>GeoPandas provides a high-level interface to the matplotlib library for making maps. Mapping shapes is as easy as using the plot() method on a GeoSeries or GeoDataFrame.

In [None]:
uttarakhand=gpd.read_file(vector_file)

In [None]:
uttarakhand.head()

In [None]:
uttarakhand.plot(color='green')

In [None]:
population = pd.Series(np.random.randint(10000, 90000, len(uttarakhand)))

In [None]:
uttarakhand['Population']=population

### Choropleth Maps

> GeoPandas makes it easy to create choropleth maps (maps where the color of each shape is based on the value of an associated variable).

In [None]:
uttarakhand.plot(column='Population')

### Creating a legend

>When plotting a map, one can enable a legend using the legend argument

In [None]:
uttarakhand.plot(column='Population',legend=True)

### Changing a legend orientation

In [None]:
uttarakhand.plot(column='Population',legend=True,legend_kwds={'label':'Population by Tehsil', 'orientation':'horizontal'}, figsize=(5,5))

### Changing colors

>The colors used by plot can be modified with the cmap option.

In [None]:
uttarakhand.plot(column='Population',cmap='Accent',legend=True)

> The way color maps are scaled can also be manipulated with the scheme option. The scheme option can be set to any scheme provided by mapclassify (e.g. 'box_plot', 'equal_interval', 'fisher_jenks', 'fisher_jenks_sampled', 'headtail_breaks', 'jenks_caspall', 'jenks_caspall_forced', 'jenks_caspall_sampled', 'max_p_classifier', 'maximum_breaks', 'natural_breaks', 'quantiles', 'percentiles', 'std_mean' or 'user_defined').

In [None]:
uttarakhand.plot(column='Population', cmap='Accent', legend=True, scheme='maximum_breaks', figsize=(10, 10))
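To get a feel for why the choice of classification matters, the sketch below bins a skewed set of made-up values in two ways with plain pandas. `pd.cut` and `pd.qcut` are used here only as rough stand-ins for the 'equal_interval' and 'quantiles' ideas; they are not the mapclassify classifiers themselves and may place class breaks differently.

```python
import pandas as pd

population = pd.Series([10, 12, 15, 20, 35, 100])  # illustrative, skewed values

# Equal-interval style: split the value range into 3 bins of equal width.
equal_width = pd.cut(population, bins=3, labels=False)

# Quantile style: split so each of the 3 bins holds about the same number of values.
equal_count = pd.qcut(population, q=3, labels=False)

print(equal_width.tolist())  # [0, 0, 0, 0, 0, 2]
print(equal_count.tolist())  # [0, 0, 1, 1, 2, 2]
```

With equal-width bins the single large value takes a class of its own and the rest share one color, while quantile bins spread the values evenly across classes.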

In [None]:
del uttarakhand