![python_cover_page.png](attachment:python_cover_page.png)

I have published a comprehensive tutorial on [Geospatial Data Science in R](https://zia207.github.io/geospatial-r-github.io/). Because of the high demand for a similar tutorial in Python, I have attempted to reproduce my R tutorial in Python. I created this tutorial for students from different disciplines, such as agriculture, soil science, environmental health, environmental engineering, and data science. Most of them have no prior knowledge of GIS, remote sensing, or any other area of geoinformatics. To work with spatial data, however, they need to know how to process spatial data from different domains and to be familiar with some basic spatial data analysis techniques.

[Python](https://www.python.org/) is an open-source scripting language that is used in different GIS software packages (such as ArcGIS, QGIS, and PostGIS). It is highly efficient for big data analysis and supports most data formats. Developing a Geospatial Analysis tutorial in Python as comprehensive as the R one is challenging for me, since my Python coding skills are not as good as my R skills.


# Getting Started with Python and Python Geo-Spatial Libraries

###  Install Python and Python Geo-spatial Libraries

This tutorial has been tested to work on Windows 10 with [Anaconda3](https://www.anaconda.com/) 64 bit, using [conda](https://docs.conda.io/en/latest/). 

First, install Python (the Anaconda distribution) and the Python modules needed to perform various GIS tasks.

###  Download Anaconda installer (64 bit) for Windows.

Install [Anaconda](https://docs.anaconda.com/anaconda/install/) on your computer by double-clicking the installer and choosing the directory you want (this needs admin rights). Install it for all users and use the default settings.

### Creating a new environment

Creating a new environment is not strictly necessary, but installing geospatial packages from different channels may cause dependency conflicts, so it is good practice to install the geospatial stack in a clean, fresh environment.

The following command, run in the command prompt, creates a new environment named **PyGeo** (the --prefix option sets the folder in which the environment is created):


In [None]:
# (base) c:\users\zahmed2>conda create --prefix PyGeo python=3.7 

To activate this environment, use

In [None]:
# (base) c:\users\zahmed2>conda activate PyGeo

To deactivate an active environment, use

In [None]:
# (C:\PyGeo) c:\users\zahmed2>conda deactivate
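After activating an environment, it can be useful to confirm from inside Python which interpreter is actually running. The following is a minimal sketch using only the standard library; the helper name describe_environment is made up for this example, and it assumes conda sets the CONDA_PREFIX variable while an environment is active.

```python
import os
import sys


def describe_environment():
    """Return basic facts about the interpreter that is currently running."""
    return {
        # Root folder of the active environment (the PyGeo folder once activated).
        "prefix": sys.prefix,
        # Full path of the Python executable in use.
        "executable": sys.executable,
        # Assumed to be set by conda while an environment is active; None otherwise.
        "conda_prefix": os.environ.get("CONDA_PREFIX"),
        "python_version": "{}.{}.{}".format(*sys.version_info[:3]),
    }


info = describe_environment()
for key, value in info.items():
    print(f"{key}: {value}")
```

If the printed prefix does not point to the PyGeo folder, the notebook or prompt is probably still using the base environment.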

###  Installing Jupyter

It is easy to install Jupyter notebooks with the following command:

In [None]:
conda install -y jupyter

### Install Essential Python Packages

[numpy](https://pypi.org/project/numpy/): the fundamental package for array computing with Python

[pandas](https://pypi.org/project/pandas/): powerful data structures for data analysis, time series, and statistics 

[scipy](https://pypi.org/project/scipy/): scientific Library for Python

[matplotlib](https://pypi.org/project/matplotlib/): Python plotting package

[scikit-learn](https://pypi.org/project/scikit-learn/): a set of python modules for machine learning and data mining 

[statsmodels](https://pypi.org/project/statsmodels/): statistical computations and models for Python

[bokeh](https://pypi.org/project/bokeh/): interactive plots and applications in the browser from Python 

[h2o](https://pypi.org/project/h2o/): fast Scalable Machine Learning for Python

[tensorflow](https://pypi.org/project/tensorflow/): an open source machine learning framework 

[XGBoost](https://xgboost.readthedocs.io/en/latest/): an optimized distributed gradient boosting library

In [None]:
conda install scipy
conda install -c anaconda scikit-learn
conda install -c anaconda pandas
conda install -c anaconda numpy
conda install -c anaconda statsmodels 
conda install -c anaconda h2o
conda install -c conda-forge pillow
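Once the packages are installed, a quick way to confirm that Python can find them is to look each one up without importing it. This is only a sketch using the standard library; the check_packages helper is hypothetical, and the list of names is an example to adapt.

```python
import importlib.util


def check_packages(import_names):
    """Map each import name to True if Python can locate it, else False."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in import_names
    }


# Import names can differ from install names:
# scikit-learn is imported as sklearn, and pillow as PIL.
wanted = ["numpy", "pandas", "scipy", "sklearn", "statsmodels", "h2o", "PIL"]
for name, found in check_packages(wanted).items():
    print(f"{name:12s} {'ok' if found else 'MISSING'}")
```

Any package reported as MISSING can be installed with the corresponding conda command above.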

## Geospatial Python Libraries

[**GeoPandas**](http://geopandas.org/) is an open source project to make working with geospatial data in Python easier. GeoPandas extends the datatypes used by pandas to allow spatial operations on geometric types. Geometric operations are performed by shapely. GeoPandas further depends on fiona for file access and on descartes and matplotlib for plotting. 

GeoPandas depends for its spatial functionality on a large open source geospatial stack of libraries (GEOS, GDAL, PROJ). See the required dependencies below for more details. These base C libraries can sometimes be a challenge to install, so you are advised to closely follow the recommendations below to avoid installation problems. For installation instructions, please see http://geopandas.org/install.html 

**Required dependencies**:

* numpy

* pandas (version 0.23.4 or later)

* shapely (interface to [GEOS](https://trac.osgeo.org/geos))

* fiona (interface to [GDAL](https://gdal.org/))

* pyproj (interface to [PROJ](https://proj.org/))

* [six](https://pypi.org/project/six/)

[**Shapely**](https://pypi.org/project/Shapely/) is a BSD-licensed Python package for manipulation and analysis of planar geometric objects. It is based on the widely deployed GEOS (the engine of PostGIS) and JTS (from which GEOS is ported) libraries. Shapely is not concerned with data formats or coordinate systems, but can be readily integrated with packages that are. 

[**Fiona**](https://pypi.org/project/Fiona/) reads and writes geographic data files and thereby helps Python programmers integrate geographic information systems with other computer systems. Fiona contains extension modules that link the **Geospatial Data Abstraction Library (GDAL)**.

[**pyproj**](https://pypi.org/project/pyproj/) is a Python interface to [**PROJ**](https://proj.org/) (cartographic projections and coordinate transformations library).

[**six**](https://pypi.org/project/six/) is a Python 2 and 3 compatibility library.

**Rtree** provides spatial indexing for Python for quick spatial lookups.

In [None]:
conda install --channel conda-forge geopandas
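As noted above, GeoPandas delegates geometric operations to shapely. The following is a minimal sketch of what such operations look like, assuming shapely is installed in the active environment. The square field and the two sample points are made-up values for illustration.

```python
from shapely.geometry import Point, Polygon

# A 10 x 10 square field in arbitrary planar units
# (shapely is not concerned with coordinate systems).
field = Polygon([(0, 0), (10, 0), (10, 10), (0, 10)])

# Two hypothetical sampling locations.
inside = Point(5, 5)
outside = Point(15, 5)

print(field.area)                # area of the square
print(field.contains(inside))    # is the first sample inside the field?
print(field.contains(outside))   # is the second one?
print(outside.distance(field))   # shortest distance from the point to the polygon

# Buffering a point gives a polygon approximating a circle of the given radius.
zone = inside.buffer(2.0)
print(zone.within(field))
```

GeoPandas applies the same kind of operations to a whole column of geometries at once.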

The [**GDAL**](https://gdal.org/) Python package provides a number of tools for programming with and manipulating the Geospatial Data Abstraction Library (GDAL). It is actually two libraries: GDAL for manipulating geospatial raster data and OGR for manipulating geospatial vector data. The main modules are imported from the osgeo package as follows:

1. from osgeo import gdal
2. from osgeo import ogr
3. from osgeo import osr
4. from osgeo import gdal_array
5. from osgeo import gdalconst

In [None]:
conda install gdal

The [Python Shapefile Library (**PyShp**)](https://pypi.org/project/pyshp/) provides read and write support for the Esri Shapefile format, a popular Geographic Information System vector data format created by Esri.

To read a shapefile, create a new "Reader" object and pass it the name of an existing shapefile. A shapefile is actually a collection of at least three files (.shp, .shx, and .dbf). You can specify either the base filename of the shapefile or the complete filename of any of its component files.

You can install PyShp with the following command in your terminal (Ubuntu) or cmd (Windows):

In [None]:
conda install -c conda-forge pyshp

[**Earthpy**](https://pypi.org/project/earthpy/) is a set of helper functions to make working with spatial data in open source tools easier. This package is maintained by Earth Lab and was originally designed to support the earth analytics education program.

In [None]:
conda install -c conda-forge earthpy

[**Rasterio**](https://rasterio.readthedocs.io/en/stable/) reads and writes GeoTIFF and other raster formats and provides a Python API based on N-D arrays and GeoJSON. You can install rasterio from the conda-forge channel with:

In [None]:
conda install -c conda-forge rasterio

The [**georasters**](https://pypi.org/project/georasters/) package is a Python module that provides a fast and flexible tool to work with GIS raster files. It provides the GeoRaster class, which makes working with rasters quite transparent and easy. In a way it tries to do for rasters what GeoPandas does for geometries.

In [None]:
conda install -c conda-forge georasters 

[rasterstats](https://pythonhosted.org/rasterstats/) is a Python module for summarizing geospatial raster datasets based on vector geometries. It includes functions for zonal statistics and interpolated point queries. The command-line interface allows for easy interoperability with other GeoJSON tools.

In [None]:
conda install -c conda-forge rasterstats 
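To make the idea of zonal statistics concrete, here is a small NumPy-only sketch. It does not use rasterstats itself: it assumes the zones have already been rasterized onto the same grid as the values, and both arrays and the zonal_summary helper are hypothetical.

```python
import numpy as np

# Hypothetical 4 x 4 raster of values (for example, elevation).
raster = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0, 12.0],
    [13.0, 14.0, 15.0, 16.0],
])

# Zone labels on the same grid: each cell belongs to zone 1 or zone 2.
# In practice these would come from rasterizing vector polygons.
zones = np.array([
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [1, 1, 2, 2],
])


def zonal_summary(values, labels):
    """Return {zone: {"min", "max", "mean", "count"}} for each zone label."""
    summary = {}
    for zone in np.unique(labels):
        cells = values[labels == zone]
        summary[int(zone)] = {
            "min": float(cells.min()),
            "max": float(cells.max()),
            "mean": float(cells.mean()),
            "count": int(cells.size),
        }
    return summary


stats = zonal_summary(raster, zones)
print(stats)
```

rasterstats performs the polygon-to-grid step for you and works directly from vector geometries and a raster file.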

The [**mgwr**](https://pypi.org/project/mgwr/) module provides functionality to calibrate multiscale (M)GWR as well as traditional GWR. It is built upon the sparse generalized linear modeling (spglm) module.

In [None]:
conda install -c conda-forge mgwr

[**PySAL**](https://pypi.org/project/pysal/), the Python spatial analysis library, is an open source cross-platform library for geospatial data science with an emphasis on geospatial vector data written in Python. It supports the development of high-level applications for spatial analysis.

In [None]:
conda install -c anaconda pysal

[**OSMnx**](https://pypi.org/project/osmnx/) is a package to easily download, model, project, visualize, and analyze complex street networks from OpenStreetMap in Python with NetworkX.

In [None]:
conda install -c conda-forge osmnx

[**Cartopy**](https://scitools.org.uk/cartopy/docs/latest/) is a Python package designed for geospatial data processing in order to produce maps and other geospatial data analyses. Cartopy makes use of the powerful PROJ.4, NumPy and Shapely libraries and includes a programmatic interface built on top of Matplotlib for the creation of publication quality maps.

In [None]:
conda install cartopy

[**OWSLib**](https://geopython.github.io/OWSLib/) is a Python package for client programming with Open Geospatial Consortium (OGC) web service (hence OWS) interface standards, and their related content models.

OWSLib was buried down inside PCL, but has been brought out as a separate project in r481.

In [None]:
conda install -c conda-forge owslib

[**mapclassify**](https://pypi.org/project/mapclassify/) is an open-source Python library for choropleth map classification. It is part of a refactoring of PySAL.

In [None]:
conda install -c conda-forge mapclassify
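Choropleth classification means grouping numeric values into a small number of classes, each drawn in its own colour. The sketch below illustrates one common scheme, quantile classification, using NumPy only. It is not the mapclassify API; the quantile_classes helper and the sample rates are made up for the example.

```python
import numpy as np


def quantile_classes(values, k=4):
    """Assign each value to one of k quantile classes (0 .. k-1)."""
    values = np.asarray(values, dtype=float)
    # Interior break points at the 1/k, 2/k, ... quantiles.
    breaks = np.quantile(values, [i / k for i in range(1, k)])
    # right=True puts a value equal to a break into the lower class.
    classes = np.digitize(values, breaks, right=True)
    return breaks, classes


rates = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0]
breaks, classes = quantile_classes(rates, k=4)
print(breaks)
print(classes)
```

Each class index can then be mapped to a colour when the polygons are drawn.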

[**descartes**](https://pypi.org/project/descartes/) lets you use geometric objects as matplotlib paths and patches.

In [None]:
conda install -c conda-forge descartes

[**pysheds**](https://pypi.org/project/pysheds/) provides simple and fast watershed delineation in Python.

In [None]:
conda install -c conda-forge pysheds 

[**RichDEM**](https://richdem.readthedocs.io/en/latest/) is a library for high-performance terrain analysis.

In [None]:
conda install -c giswqs richdem

[**geoplot**](https://residentmario.github.io/geoplot/index.html) is a high-level Python geospatial plotting library. It’s an extension to cartopy and matplotlib which makes mapping easy. 

In [None]:
conda install -c conda-forge geoplot

[**folium**](https://python-visualization.github.io/folium/) builds on the data wrangling strengths of the Python ecosystem and the mapping strengths of the Leaflet.js library. Manipulate your data in Python, then visualize it in a Leaflet map via folium.

In [None]:
conda install -c conda-forge folium 

[**RSGISLib**](https://www.rsgislib.org/index.html#python-documentation)

The [Remote Sensing and GIS Software Library (RSGISLib)](https://www.rsgislib.org/index.html#python-documentation) is a collection of tools for processing remote sensing and GIS datasets. The tools are accessed using Python bindings or an XML interface.

Binary downloads built with Python 3 are available for Windows, Linux, and MacOS through conda. You can get conda through the Anaconda or Miniconda Python distribution. The recommended way to install RSGISLib locally is from conda-forge using the following commands on MacOS and Linux:

In [None]:
conda create -n osgeo-env-v1 python=3.7
source activate osgeo-env-v1
conda install -c conda-forge rsgislib

In [None]:
python -m ipykernel install --user --name osgeo-env-v1 --display-name "Python 3.7 (OSGEO-ENV)"

[**GeostatsPy**](https://pypi.org/project/geostatspy/0.0.2/) includes functions that run 2D workflows in GSLIB from Python (i.e. low tech wrappers), Python translations and in some cases reimplementations of GSLIB methods, along with utilities to move between GSLIB's Geo-EAS data sets and DataFrames, and between grids and 2D NumPy arrays, and other useful operations such as resampling from regular datasets and rescaling distributions.

For installation, you can download the .whl file [(geostatspy-0.0.2-py3-none-any.whl (20.3 kB)](https://pypi.org/project/geostatspy/0.0.2/#files) and install it on your system, or install the same version from PyPI with pip:

In [None]:
pip install geostatspy==0.0.2

[**XGBoost**](https://xgboost.readthedocs.io/en/latest/) is an optimized distributed gradient boosting library designed to be highly efficient, flexible and portable. It implements machine learning algorithms under the Gradient Boosting framework. XGBoost provides parallel tree boosting (also known as GBDT, GBM) that solves many data science problems in a fast and accurate way. The same code runs on major distributed environments (Hadoop, SGE, MPI) and can solve problems beyond billions of examples.

In [None]:
conda install py-xgboost

[**TensorFlow**](https://www.tensorflow.org/) is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries and community resources that lets researchers push the state-of-the-art in ML and developers easily build and deploy ML powered applications.

The Keras library is a high-level API that runs on top of TensorFlow for building deep learning models. Often, building a very complex deep learning network with Keras can be achieved with only a few lines of code. For installation of TensorFlow GPU in Python 3.7, please see Jeff Heaton's very useful GitHub site. I mostly follow his class website for regression analysis with Keras. 

In [None]:
conda install tensorflow
# or, for GPU support: conda install tensorflow-gpu

[**H2O**](https://www.h2o.ai/products/h2o/) is a fully open source, distributed in-memory machine learning platform with linear scalability. H2O supports the most widely used statistical & machine learning algorithms, including gradient boosted machines, generalized linear models, deep learning and more. 

In [None]:
conda install -c h2oai h2o

[**mlxtend**](http://rasbt.github.io/mlxtend/) is a library of Python tools and extensions for data science.

In [None]:
conda install mlxtend --channel conda-forge

###  Resources: 

* [Geo Python AutoGIS](https://automating-gis-processes.github.io/2017/course-info/Installing_Anacondas_GIS.html)

* [Earth Data Science Tutorials in Python](https://www.earthdatascience.org/workshops/gis-open-source-python/dissolve-polygons-in-python-geopandas-shapely/)

* [Getting started with Geographic Data Science in Python](https://towardsdatascience.com/master-geographic-data-science-with-real-world-projects-exercises-96ac1ad14e63) 

* [Getting map data from OpenStreetMap](https://michelleful.github.io/code-blog//2015/04/27/osm-data/)

* [Mapping in Python with geopandas](http://darribas.org/gds15/content/labs/lab_03.html)

* [Mapping with geopandas](http://jonathansoma.com/lede/foundations-2017/classes/geopandas/mapping-with-geopandas/)

* [Geographic Data with Basemap](https://jakevdp.github.io/PythonDataScienceHandbook/04.13-geographic-data-with-basemap.html) 

* [GeoSpatial Analysis of NYC Taxi data](https://nbviewer.jupyter.org/urls/gist.githubusercontent.com/mrocklin/ba6d3e2376e478c344af7e874e6fcbb1/raw/e0db89644f78f4371ee30fbdd517ce9bd6032a5e/nyc-taxi-geospatial.ipynb)  

* [Lesson 3 Python Geo and Data Science Packages & Jupyter Notebooks](https://www.e-education.psu.edu/geog489/book/export/html/1734) 

* [The GeoPandas Cookbook](https://www.martinalarcon.org/2018-12-31-d-geopandas/) 

* [Multiscale Geographically Weighted Regression (MGWR)](https://github.com/pysal/mgwr) 

* [GeostatsPy Package](https://github.com/GeostatsGuy/GeostatsPy)

* [WUR Geoscripting](https://geoscripting-wur.github.io/) 

* [Geographic Data Science with PySAL and the pydata stack](http://darribas.org/gds_scipy16/) 

* [Geology and Python](http://geologyandpython.com/dem-processing.html) 

* [pyDEM](https://github.com/creare-com/pydem) 
