# Purpose of the document

This document supplements the [Introduction to Python](http://nbviewer.jupyter.org/github/djmhunt/Introduction-to-Python/blob/master/Introduction%20to%20Python.ipynb) written for psychology researchers at Goldsmiths. It is intended as a resource for experienced programmers, pointing them to Python resources that fall outside the scope of typical psychology research.

As with the Introduction to Python, this document is expected to be a living and dynamic text, updated based on new developments and, more importantly, your feedback, questions and ideas. All will be gratefully received by Dominic Hunt at <psp01dh@gold.ac.uk>.

The document is split into three sections:

* A description of more useful Python packages
* A list of advanced Python programming resources and tutorials
* Example uses of packages

# Python packages

Python is often described as having the kitchen sink built in, thanks to its extensive [Standard Library](https://docs.python.org/2.7/library/index.html#library-index). However, the Standard Library does not contain everything, nor does it necessarily contain the best implementation of everything. Thankfully, the large community covers most of the rest. From [XKCD](http://xkcd.com/):

<a href="http://xkcd.com/353/"><img src="http://imgs.xkcd.com/comics/python.png"></a>

Here are some highlights from the Standard Library:

* [File and Directory Access](https://docs.python.org/2/library/filesys.html)
* [Functions creating iterators for efficient looping](https://docs.python.org/2/library/itertools.html)
* [Python object serialization](https://docs.python.org/2/library/pickle.html) and the [faster version](https://docs.python.org/2/library/pickle.html#module-cPickle)
* [Logging facility for Python](https://docs.python.org/2/library/logging.html)
* [A Python 2 to Python 3 converter](https://docs.python.org/2/library/2to3.html)
* [Interfaces for communicating over internet protocols](https://docs.python.org/2/library/internet.html)
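
As a rough illustration of how a few of these modules fit together, the sketch below uses `itertools` to build every pairing of two lists, `pickle` (falling back from `cPickle` where it does not exist) to serialise the result, and `logging` to report what happened. The function names, the logger name and the example stimuli are invented for illustration only.

```python
from __future__ import print_function

import itertools
import logging

try:
    import cPickle as pickle  # the faster C implementation on Python 2
except ImportError:
    import pickle  # Python 3 has no separate cPickle module

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("conditions")  # hypothetical logger name


def build_conditions(stimuli, delays):
    """Return every (stimulus, delay) pairing as a list of tuples."""
    return list(itertools.product(stimuli, delays))


def round_trip(items):
    """Serialise a list to bytes with pickle and load it back again."""
    data = pickle.dumps(items, protocol=2)
    logger.info("Pickled %d items into %d bytes", len(items), len(data))
    return pickle.loads(data)


if __name__ == "__main__":
    conditions = build_conditions(["face", "house"], [0.5, 1.0])
    restored = round_trip(conditions)
    print(restored == conditions)
```

In real use the pickled bytes would normally be written to a file opened in binary mode rather than kept in memory.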

## Cloud computing

* [PythonAnywhere](https://www.pythonanywhere.com/) is a service for developing and hosting online Python applications. It could potentially be used for hosting a complex online study or performing web scraping.
* [Amazon Elastic Compute Cloud](http://aws.amazon.com/ec2/) can be accessed with [StarCluster](http://star.mit.edu/cluster/) and [Boto](http://docs.pythonboto.org/) and used to show [Jupyter notebooks](https://notebookcloud.appspot.com/docs).
* The [Google Compute Engine](https://developers.google.com/compute/) is written with Python in mind for processor-intensive tasks.
* The [Google App Engine](https://developers.google.com/appengine/) can be used to host Python applications. A good example is a [live SymPy terminal](http://live.sympy.org/) or the fancier SymPy equivalent of [Wolfram Alpha](http://www.wolframalpha.com): [SymPy Gamma](http://www.sympygamma.com).
* [Multyvac](http://www.multyvac.com/) is an open source system for running computationally and/or data-intensive workloads. They have Python libraries for submitting jobs directly from Python to their service or your own server.

## GUI programming

The three main cross-platform GUI libraries are [TkInter](https://wiki.python.org/moin/TkInter) (part of the Standard Library), [Qt](http://qt-project.org/wiki/PySide) and [wxPython](http://www.wxpython.org/) (with the more Pythonic wrapper [wax](http://waxgui.sourceforge.net/)).

For website development there is [Pyjs](http://pyjs.org/).

For lower-level control, [SDL](http://www.libsdl.org) has Python bindings. It is a cross-platform development library designed to provide low-level access to audio, keyboard, mouse, joystick, and graphics hardware via OpenGL and Direct3D.

## Parallel processing

It used to be said that Python code could not be parallelised because of the [Global Interpreter Lock (GIL)](https://wiki.python.org/moin/GlobalInterpreterLock), which allows only one thread at a time to execute Python bytecode. However, there are now packages to help you parallelise Python code. Some of these are:

* [multiprocessing](https://docs.python.org/2/library/multiprocessing.html), in the standard library, offers both local and remote concurrency by using subprocesses instead of threads, effectively side-stepping the Global Interpreter Lock.
* IPython [has functionality](http://ipython.org/ipython-doc/dev/parallel/index.html) for running several different forms of parallel processing. A very basic example can be found in [this IPython notebook](http://nbviewer.ipython.org/gist/jtriley/3866987).
* [Pathos](http://trac.mystic.cacr.caltech.edu/project/pathos) is a framework for heterogeneous computing. It primarily provides the communication mechanisms for configuring and launching parallel computations across heterogeneous resources. Pathos provides stagers and launchers for parallel and distributed computing, where each launcher contains the syntactic logic to configure and launch jobs in an execution environment. Some examples of included launchers are: a queue-less MPI-based launcher, an ssh-based launcher, and a multiprocessing launcher. Pathos also provides a map-reduce algorithm for each of the available launchers, thus greatly lowering the barrier for users to extend their code to parallel and distributed resources. Pathos provides the ability to interact with batch schedulers and queuing systems, thus allowing large computations to be easily launched on high-performance computing resources. One of the most powerful features of pathos is "tunnel", which enables a user to automatically wrap any distributed service calls within an ssh-tunnel.
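
A minimal sketch of the `multiprocessing` approach from the list above is shown here. It spreads independent calls across worker processes with `Pool.map`. The function `simulate_participant` is a made-up stand-in for any expensive, independent computation, and the pool size of two is an arbitrary choice.

```python
from __future__ import print_function

import multiprocessing


def simulate_participant(seed):
    """Hypothetical stand-in for an expensive, independent computation."""
    total = 0
    for i in range(1000):
        total += (seed * i) % 7
    return total


def run_all(seeds, processes=2):
    """Map simulate_participant over seeds using a pool of worker processes."""
    pool = multiprocessing.Pool(processes=processes)
    try:
        # Each seed is handled in a separate process, so the GIL of the
        # parent interpreter does not serialise the work.
        return pool.map(simulate_participant, seeds)
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    # The guard matters on Windows, where child processes re-import this module.
    seeds = [1, 2, 3, 4]
    serial = [simulate_participant(seed) for seed in seeds]
    print(run_all(seeds) == serial)
```

The worker function has to be defined at module level so that it can be pickled and sent to the child processes.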

## Replacing the command line

It is possible to have the power of Bash scripts in Python, as described in the article [Python for Bash Scripters](http://magazine.redhat.com/2008/02/07/python-for-bash-scripters-a-well-kept-secret/). I would also have a look at the functions provided by the [IPython command line](http://ipython.org/ipython-doc/stable/interactive/tutorial.html).
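
To give a flavour of what this looks like, here is a small sketch using only the standard library. It is not taken from the linked article. The helper names are invented, and the shell commands mentioned in the docstrings are only approximate equivalents.

```python
from __future__ import print_function

import fnmatch
import os
import subprocess
import sys


def find_files(root, pattern):
    """Rough equivalent of `find root -name pattern`, returning sorted paths."""
    matches = []
    for directory, _subdirs, filenames in os.walk(root):
        for name in fnmatch.filter(filenames, pattern):
            matches.append(os.path.join(directory, name))
    return sorted(matches)


def count_lines(path):
    """Rough equivalent of `wc -l path`."""
    with open(path, "rb") as handle:
        return sum(1 for _line in handle)


def run(command):
    """Run an external command, given as a list, and return its output as text."""
    return subprocess.check_output(command).decode("utf-8").strip()


if __name__ == "__main__":
    for path in find_files(".", "*.py"):
        print(count_lines(path), path)
    print(run([sys.executable, "-c", "print('hello from a subprocess')"]))
```

Passing the command as a list rather than a single string avoids the need for shell quoting.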

## Communication

[Twilio](https://www.twilio.com/docs/quickstart/python/sms/sending-via-rest) allows you to send text messages and to manage sending and receiving phone calls through their service. It has a free SMS service for low volumes. There are other SMS APIs, but this is the most widely used.

## Speech recognition

[Dragonfly](https://github.com/t4ngo/dragonfly) is a speech recognition framework. It is a Python package which offers a high-level object model and allows its users to easily write scripts, macros, and programs which use speech recognition.

It currently supports the following speech recognition engines: Dragon NaturallySpeaking (DNS), a product of Nuance; and Windows Speech Recognition (WSR), which is included with Microsoft Windows Vista and Windows 7 and is freely available for Windows XP.


## Debugging

[Winpdb](http://winpdb.org/download/) is a platform-independent, GPL-licensed Python debugger with a GUI. It supports multiple threads, namespace modification, embedded debugging and encrypted communication, and is up to 20 times faster than the [standard Python debugger](https://docs.python.org/2/library/pdb.html).

Short pure Python code can be tested online with [ideone](http://ideone.com), which gives you an online editable file where you can run your code for free.

## Documentation

[Sphinx](http://sphinx-doc.org/) is, with good reason, the standard tool for creating documentation in Python. The syntax used by Sphinx is [reStructuredText](http://docutils.sourceforge.net/rst.html), which _attempts to define and implement a markup syntax for use in Python docstrings and other documentation domains, that is readable and simple, yet powerful enough for non-trivial use. The intended purpose of the markup is the conversion of reStructuredText documents into useful structured data formats._
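
As a sketch of what this looks like in practice, the function below carries a docstring written with reStructuredText field lists (`:param:`, `:type:`, `:returns:`, `:rtype:` and `:raises:`), which Sphinx can render when it extracts docstrings. The function itself and its parameter names are invented purely for illustration.

```python
def reaction_time_mean(times, exclude_above=None):
    """Return the mean of a sequence of reaction times.

    :param times: Reaction times in seconds.
    :type times: list of float
    :param exclude_above: Optional cut-off; slower responses are dropped.
    :type exclude_above: float or None
    :returns: The mean of the retained reaction times.
    :rtype: float
    :raises ValueError: If no reaction times remain after exclusion.
    """
    if exclude_above is not None:
        # Keep only the responses at or below the cut-off.
        times = [t for t in times if t <= exclude_above]
    if not times:
        raise ValueError("no reaction times to average")
    return sum(times) / float(len(times))
```

The docstring stays readable as plain text in the source file, which is the point of the markup.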

## Testing

There are, to my knowledge, three main testing packages:

* [unittest](https://docs.python.org/2/library/unittest.html) is built into the Python standard library.
* [Nose](http://nose.readthedocs.org) is written as an extension of unittest.
* [pytest](http://pytest.org) is a testing framework with no boilerplate and simple test discovery that is also capable of running the tests written for unittest and nose.

A good overview of them can be found in the blog [Python testing](http://pythontesting.net).
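
To make the difference in style concrete, the sketch below tests one small function in two ways: as methods on a `unittest.TestCase` subclass, and as a plain function with a bare `assert`, which is the form pytest discovers. The `median` function and the test names are made up for this example.

```python
import unittest


def median(values):
    """Return the median of a non-empty sequence of numbers."""
    ordered = sorted(values)
    middle = len(ordered) // 2
    if len(ordered) % 2:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2.0


class MedianTests(unittest.TestCase):
    """unittest style: tests are methods on a TestCase subclass."""

    def test_odd_length(self):
        self.assertEqual(median([3, 1, 2]), 2)

    def test_even_length(self):
        self.assertEqual(median([4, 1, 3, 2]), 2.5)


def test_median_single_value():
    """pytest style: a plain function using a bare assert."""
    assert median([7]) == 7


def run_unittest_suite():
    """Run MedianTests programmatically and return the result object."""
    suite = unittest.TestLoader().loadTestsFromTestCase(MedianTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_unittest_suite()
```

Because pytest can also run `TestCase` classes, pointing it at a file like this would pick up both styles.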

## Other Python distributions

[Enthought Canopy](https://www.enthought.com/products/canopy/) runs on Windows, Linux or OSX and is free for academic use.

[Pyzo](http://www.pyzo.org/) is another distribution that runs on Windows, Linux or OSX and that installs without needing administrator privileges. It is a Python 3 only distribution.

## Other IDEs

* One that is aimed squarely at programmers with complicated code is [JetBrains PyCharm](https://www.jetbrains.com/pycharm/). This is the IDE I prefer for now.
* [Interactive Editor for Python (IEP)](http://www.iep-project.org/) is a cross-platform Python IDE aimed at interactivity and introspection while remaining lightweight.
* If you enjoy the Visual Studio development environment and have been using the [Goldsmiths Dreamspark copy](http://e5.onthehub.com/WebStore/ProductsByMajorVersionList.aspx?ws=c1079a19-836f-e011-971f-0030487d8897&vsro=8), then you can use it for Python with the add-on [Python for Visual Studio](http://pytools.codeplex.com/).

# Resources for learning about Python

To advance your more formal understanding of programming, a resource written using Python examples is [Think Python: How to Think Like a Computer Scientist](http://greenteapress.com/thinkpython/html/index.html).

I have also seen some people recommending [Learn Python the Hard Way](http://learnpythonthehardway.org/book/).

The canonical description of the design philosophy of the Python community can be found in The Zen of Python. It can be accessed from within Python with the simple line:

In [1]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


# Example uses of packages

In [2]:
#%load_ext version_information
#%reload_ext version_information

%version_information numpy, scipy, matplotlib, pandas

| Software | Version |
|---|---|
| Python | 2.7.13 64bit [MSC v.1500 64 bit (AMD64)] |
| IPython | 5.3.0 |
| OS | Windows 10 10.0.15063 |
| numpy | 1.12.1 |
| scipy | 0.19.0 |
| matplotlib | 2.0.1 |
| pandas | 0.20.1 |

Tue May 16 10:10:09 2017 GMT Summer Time
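
Outside IPython, where the `%version_information` magic is not available, a similar report can be approximated with the standard library. The sketch below is not how the extension itself works. The function name is invented, and it assumes that each package exposes a `__version__` attribute, which is a common convention rather than a guarantee.

```python
from __future__ import print_function

import importlib
import platform


def collect_versions(package_names):
    """Return a list of (name, version) pairs, starting with Python itself."""
    rows = [("Python", platform.python_version())]
    for name in package_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            rows.append((name, "not installed"))
        else:
            # Assumption: the package follows the __version__ convention.
            rows.append((name, getattr(module, "__version__", "unknown")))
    return rows


if __name__ == "__main__":
    packages = ["numpy", "scipy", "matplotlib", "pandas"]
    for name, version in collect_versions(packages):
        print("{0},{1}".format(name, version))
```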
