Skip to content

DigitalGeographyLab/Python-Data-Science-Environment

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

16 Commits
 
 

Repository files navigation

Python Data Science Environment

The purpose of this page is to help you to install Python and different Python packages into your own computer / server using Anaconda distribution package.

Anaconda Python distribution package

Even though it is possible to install Python from their homepage, we highly recommend using Anaconda which is an open source distribution of the Python programming language for large-scale data processing, predictive analytics, and scientific computing, that aims to simplify package management and deployment. In short, it makes life much easier when installing new tools to your Python.

Installing Mini-conda

The basic Anaconda distribution package comes with large number of different packages. However, in some cases you don't want to install those but to specify yourself exactly what packages should be installed.

For this purpose, there is a mini-conda package distribution available that basically comes with only Python interpreter and the conda package manager that can be used to install Python packages easily. See installation directions for mini-conda on Linux, Windows, or Mac.

Download links for mini-conda installers (version 4.3.21)

You can see the whole archive from here.

Python packages for (geo)data science

Here are packages that are helpful when doing data analysis with Python:

Data analysis & Visualization

  • Numpy --> Fundamental package for scientific computing with Python
  • Pandas --> High-performance, easy-to-use data structures and data analysis tools
  • Python-dateutil --> Powerful extensions to basic datetime functions
  • Pytz --> World TimeZone definitions for working with time-data
  • Scipy --> A collection of numerical algorithms and domain-specific toolboxes, including signal processing, optimization and statistics
  • Matplotlib --> Basic plotting library for Python
  • Scikit-learn --> Machine learning library for Python
  • NetworkX --> NetworkX is a Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks.
  • Bokeh --> Interactive visualizations for the web (also maps)
  • Statsmodels --> Statistical methods for Python
  • PySpark --> Python wrapper for Spark
  • Spyder IDE --> Scientific PYthon Development EnviRonment (IDE)

GIS related packages (spatial analysis)

  • Geopandas --> Working with geospatial data in Python made easier, combines the capabilities of pandas and shapely.
  • Shapely --> Python package for manipulation and analysis of planar geometric objects (based on widely deployed GEOS).
  • GDAL --> Fundamental package for processing vector and raster data formats (many modules depend on this).
  • Fiona --> Reading and writing spatial data (required by geopandas).
  • Pyproj --> Performs cartographic transformations and geodetic computations (based on PROJ.4).
  • Pysal --> Library of spatial analysis functions written in Python.
  • Cartopy --> Make drawing maps for data analysis and visualisation as easy as possible.
  • Rtree --> Spatial indexing for Python for quick spatial lookups.
  • Geoplot --> High-level geospatial data visualization library for Python.
  • OSMnx --> Python for street networks. Retrieve, construct, analyze, and visualize street networks from OpenStreetMap

Installation of the packages

After you have installed mini-conda, it is easy to install the Python packages above by running following commands in Terminal.

Versions of the packages as of 29.9.2017 (on Windows). Following installation commands should work similarly with all operating systems (Windows, Mac, Linux)

# Install numpy (v 1.13.1)
conda install numpy

# Install pandas (v 0.20.3) --> bundled with python-dateutil (v 2.6.1) and pytz (v 2017.2)
conda install pandas

# Install scipy (v 0.19.1)
conda install scipy

# Install matplotlib (v 2.0.2) --> bundled with cycler, freetype, icu, jpeg, libpng, pyqt, qt, sip, sqlite, tornado, zlib
conda install matplotlib

# Install scikit-learn (v 0.19.0)
conda install scikit-learn

# Install networkx (v 1.11) --> bundled with decorator (v 4.1.2)
conda install networkx

# Install bokeh (v 0.12.9) --> bundled with jinja2, markupsafe, pyyaml, yaml -packages
conda install bokeh

# Install statsmodels (v 0.8.0) --> bundled with patsy (0.4.1)
conda install statsmodels

# Install PySpark (v 2.2.0) --> bundled with py4j (v 0.10.6)
conda install pyspark

# Install Geopandas (v 0.3.0) --> bundled with click, click-plugins, cligj, curl, descartes, expat, fiona, freexl, gdal, geos, hdf4, hdf5, kealib, krb5, libiconv, libnetcdf, libpq, libspatialindex, libspatialite, libtiff, libxml2, munch, openjpeg, pcre, proj4, psycopg2, pyproj, pysal, rtree, shapely, sqlalchemy, xerces-c
conda install -c conda-forge geopandas

# Install cartopy (v 0.15.1) --> bundled with libxslt, lxml, olefile, owslib, pillow, pyepsg, pyshp 
conda install -c conda-forge cartopy

# Install geoplot (v 0.0.4) using pip (on Linux: be sure to use pip that comes with conda distribution!) --> bundled with seaborn
pip install geoplot

# Install osmnx (v 0.5.4) --> bundled with altair, bleach, branca, colorama, entrypoints, folium, geopy, html5lib, ipykernel, ipython, ipython_genutils, jedi, jsonschema, jupyter_client, jupyter_core, mistune, nbconvert, nbformat, notebook, pandoc, pandocfilters, pickleshare, prompt_toolkit, pygments, pyzmq, simplegeneric, testpath, traitlets, vega, vincent, wcwidth, webencodings
conda install -c conda-forge osmnx

# Install Spyder
conda install -c anaconda spyder 

Test that everything works

You can test that the installations have worked by running following commands in your IPython console (comes with mini-conda).

import numpy as np
import pandas as pd
import geopandas as gpd
import scipy
import shapely
import matplotlib.pyplot as plt
import pysal
import bokeh
import cartopy
import statsmodels
import sklearn
import pyspark
import geoplot
import osmnx

If you don't receive any errors, everything should be working!

About

Installation instructions for GIS-related Python packages using mini-conda.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published