python-tutorials

This repository contains

  • installation instructions for a minimal Python environment
  • Jupyter notebooks that introduce the basic concepts of Python
  • quizzes to test the understanding of these concepts

Prerequisites

  • access to a computing environment with installation rights
  • programming experience, i.e. familiarity with basic concepts like control flow and data structures

Outline for a possible course

+--Motivation
   Installation Instruction
   Programming Environments
+--Data Types
   Control Flow
   Modules
+--package NumPy
+--package SciPy
+--package scikit-learn
+--package matplotlib (Python plotting, object-oriented)
+--package pandas (Python Data Analysis Library)

Setup

Choosing the proper installation candidate

There are currently two major versions of Python: the older Python 2 and the newer Python 3. We use the latter; its latest stable release is 3.6.5 (as of 28 Mar 2018).
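To confirm which major version a given interpreter belongs to, a check along the following lines may help. This is a minimal sketch; the helper name `is_python3` is made up for illustration and is not part of the tutorials.

```python
import sys

def is_python3(version_info=sys.version_info):
    """Return True if the given version tuple belongs to Python 3."""
    return version_info[0] == 3

# sys.version_info is a named tuple such as (3, 6, 5, 'final', 0)
print("major.minor.micro:", ".".join(str(part) for part in sys.version_info[:3]))
print("running Python 3:", is_python3())
```

Running this with the interpreter from the Anaconda installation should report a 3.x version.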

Installation Recipe

  1. Download the latest (64-bit) Anaconda3 installer from http://continuum.io/download and launch it with (the file name depends on the version you downloaded)

    $ bash Anaconda3-1.9.1-Linux-x86_64.sh
    

You need to agree to the license agreement and may (optionally) specify a target directory (default is ~/anaconda3, my choice is ~/local/share/anaconda3).

  2. The installer then suggests prepending the Anaconda path in .bashrc. (You may already have this, e.g. via .profile.)

  3. Next, we add three channels to the default one (in this order):

    $ conda config --add channels conda-forge
    $ conda config --add channels defaults
    $ conda config --add channels r
    $ conda config --add channels bioconda
    

    bioconda is for bioinformatics (what's your requirement?) and will receive the highest priority. r is required for bioinformatics and contains modules for the GNU R programming language. The defaults channel already contains plenty of packages (TODO: list them?). Finally, conda-forge contains several community-built packages that are not already in the default channel.
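Why the channel added last ends up with the highest priority can be illustrated with a small model. The sketch below does not call conda; it only assumes that `conda config --add channels NAME` puts NAME at the front of the channel list (moving it there if it is already present), and that the list initially contains only `defaults`. The function `add_channel` is hypothetical.

```python
def add_channel(channels, name):
    """Model of `conda config --add channels NAME`: put NAME at the front."""
    return [name] + [c for c in channels if c != name]

channels = ["defaults"]  # assumed initial configuration
for name in ["conda-forge", "defaults", "r", "bioconda"]:
    channels = add_channel(channels, name)

# Highest priority first
print(channels)
```

Under these assumptions the resulting order is bioconda, r, defaults, conda-forge, which matches the priorities described above.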

Updating

The installer ships the full set of packages that belong to the anaconda meta-package. Let's update it with:

$ conda update anaconda

Modules/Packages

Here's the full package list.

Anaconda is not only the name of the Python distribution, but also the name of its largest meta-package. To maximize compatibility (and minimize maintenance effort), we use packages in the following order of priority:

  1. packages from the standard library (see below)
  2. packages from the anaconda meta-package (presumably the packages marked as "In Installer" at https://docs.anaconda.com/anaconda/packages/py3.6_linux-64/)
  3. packages from conda's default channel that are not in the default installer's full list of packages, like keras
  4. only if necessary: packages from selected additional conda channels (r, bioconda)

Minimal Package List

(for reference, when you have to reproduce your environment outside of anaconda, e.g. in SageMath)

favorites from the standard library

  • datetime
  • csv
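As a quick illustration of the first favorite, the sketch below does some date arithmetic with `datetime`. The date is the one quoted in the Setup section; the variable names are chosen only for this example.

```python
from datetime import date, timedelta

release_note = date(2018, 3, 28)              # date quoted in the Setup section
one_week_later = release_note + timedelta(days=7)

print(release_note.isoformat())               # ISO 8601 text form
print(one_week_later.isoformat())
print("weekday index (Monday = 0):", release_note.weekday())
```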

non-standard packages in conda's default installation

  • interface
    • conda (I)
    • jupyter (I)
  • math
    • numpy (I)
    • scipy (I)
  • data analysis
    • pandas (I)
    • scikit-learn (imported as sklearn) (I)
  • visualization
    • matplotlib (I)
    • seaborn (I)

non-standard packages in the default channel

  • keras
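When reproducing this environment elsewhere, it can be useful to check which of the listed packages are actually importable. The following is a minimal sketch using only the standard library; the helper `available` is hypothetical, and the names in `wanted` are import names, which can differ from package names (scikit-learn is imported as `sklearn`).

```python
import importlib.util

def available(names):
    """Map each top-level module name to True if it can be found for import."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

# Import names taken from the minimal package list above.
wanted = ["datetime", "csv", "numpy", "scipy", "pandas",
          "sklearn", "matplotlib", "seaborn", "keras"]

for name, found in available(wanted).items():
    print(f"{name:12s} {'ok' if found else 'MISSING'}")
```

`find_spec` only locates the module without importing it, so the check stays fast even for large packages.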

Update

Update conda itself first, then the anaconda meta-package:

 $ conda update conda
 $ conda update anaconda

This confuses me (and others), so the current recommendation in Anaconda's blog [1] for "What 95% of People Want" is:

$ conda update --all
$ conda

Optionally:

$ conda remove anaconda

And if things break:

$ conda clean --all

Development

Command-Line Interface

  • (default) python interactive shell
  • MYCHOICE ipython shell

Jupyter Notebook

Web application for interactive Python worksheets.

Integrated Development Environment

  • MYCHOICE spyder, a quick introduction by Joey Bernard
  • pycharm
  • Eclipse with PyDev plugin
  • Emacs with ...

Note: Spyder is already available in the standard installation, but if we want/need more advanced profiling, there's

   $ conda config --add channels spyder-ide
   $ conda install -c spyder-ide spyder-line-profiler
   $ conda install -c spyder-ide spyder-memory-profiler

Packages

are collections of modules in Python; in everyday use, the two terms are often used interchangeably
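The distinction can be inspected at run time: a package carries a `__path__` attribute, while a plain module does not. A small sketch with two standard-library names (the helper `is_package` is made up for this example, and the results assume the CPython layout, where `csv` is a single file and `json` is a directory):

```python
import csv    # a single module (one .py file in CPython)
import json   # a package (a directory of modules)

def is_package(module):
    """Packages carry a __path__ attribute; plain modules do not."""
    return hasattr(module, "__path__")

print("csv is a package: ", is_package(csv))
print("json is a package:", is_package(json))
```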

Favorite Non-Standard Packages

NumPy

fast numerical computing, in particular with large arrays and matrices; SciPy builds on it, but NumPy is a separate package that can be installed and imported on its own

References:

SciPy

(large) scientific computing library, based on NumPy arrays (NumPy is a required dependency)

scikit-learn

machine learning, built to work well with NumPy and SciPy

pandas

Pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive

matplotlib

Matplotlib is a Python 2D plotting library which produces publication quality figures in a variety of hardcopy formats and interactive environments across platforms. Here's a short introduction.

keras

Keras is a high-level neural networks API, written in Python and capable of running on top of either TensorFlow or Theano. It was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research.

Further Non-Standard Libraries

qutip

to simulate quantum systems. Very short introduction by Joey Bernard.

Standard Library (Batteries Included)

see the complete list at https://docs.python.org/3/library/

array

This module defines an object type which can compactly represent an array of basic values: characters, integers, floating point numbers. Arrays are sequence types and behave very much like lists, except that the type of objects stored in them is constrained.
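The type constraint is easiest to see in a short example. This is a minimal sketch; the values are arbitrary.

```python
from array import array

values = array("d", [1.0, 2.5, 4.0])   # "d" = C double
values.append(3)                        # an int is converted to a float
print(values)
print(values.itemsize, "bytes per item")

try:
    values.append("text")               # a str does not fit the type code
except TypeError as err:
    print("rejected:", err)
```

Unlike a list, the array refuses the string, which is what makes the compact storage possible.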

collections

This module implements specialized container datatypes providing alternatives to Python’s general purpose built-in containers, dict, list, set, and tuple.
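A few of these containers in action, as a rough sketch (the sample sentence and the names `Point` and `recent` are chosen only for illustration):

```python
from collections import Counter, defaultdict, deque, namedtuple

words = "to be or not to be".split()

counts = Counter(words)                 # counts occurrences of each word
by_length = defaultdict(list)           # missing keys start as empty lists
for word in words:
    by_length[len(word)].append(word)

Point = namedtuple("Point", ["x", "y"])  # tuple with named fields
origin = Point(x=0, y=0)

recent = deque(maxlen=3)                # keeps only the last three items
for number in range(5):
    recent.append(number)

print(counts["to"], counts["not"])
print(dict(by_length))
print(origin.x, origin.y)
print(list(recent))
```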

csv

Reading and writing of CSV files.
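A round trip through the `csv` module might look as follows. The sketch writes to an in-memory buffer instead of a file so that it is self-contained; the column names and rows are illustrative and borrowed from the package list above.

```python
import csv
import io

rows = [
    {"package": "numpy", "topic": "math"},
    {"package": "pandas", "topic": "data analysis"},
]

# Write the rows, including a header line, into an in-memory text buffer.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["package", "topic"])
writer.writeheader()
writer.writerows(rows)

# Rewind and read everything back; every value comes back as a string.
buffer.seek(0)
loaded = [dict(row) for row in csv.DictReader(buffer)]
print(loaded)
```

For a real file, replace the buffer with `open(path, newline="")`, as the csv documentation recommends.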

itertools

This module implements a number of iterator building blocks inspired by constructs from APL, Haskell, and SML. Each has been recast in a form suitable for Python.
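Three of these building blocks, sketched with arbitrary sample data:

```python
from itertools import chain, combinations, count, islice

# All unordered pairs from a short list
pairs = list(combinations(["numpy", "scipy", "pandas"], 2))

# First five items of an infinite stream of squares
first_squares = list(islice((n * n for n in count(1)), 5))

# Several iterables treated as one
merged = list(chain([1, 2], [3], [4, 5]))

print(pairs)
print(first_squares)
print(merged)
```

The iterators are lazy: `count(1)` never ends, but `islice` stops after five items.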

math

It provides access to the mathematical functions defined by the C standard.
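A handful of typical calls, as a minimal sketch:

```python
import math

root = math.sqrt(16.0)
hypotenuse = math.hypot(3.0, 4.0)

# sin(pi) is not exactly zero in floating point, so compare with a tolerance
sine_is_zero = math.isclose(math.sin(math.pi), 0.0, abs_tol=1e-12)

print(root, hypotenuse, sine_is_zero)
print(math.floor(-2.5), math.ceil(-2.5))
```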

multiprocessing

multiprocessing is a package that supports spawning processes using an API similar to the threading module.

os

This module provides a portable way of using operating system dependent functionality.
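As an example of the portable interface, the sketch below builds paths with `os.path.join` and walks a directory tree with `os.walk`. It works in a temporary directory so that nothing on disk is left behind; the helper `list_files` and the file names are hypothetical.

```python
import os
import tempfile

def list_files(root):
    """Return all file paths below root, relative to root, in sorted order."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full = os.path.join(dirpath, filename)
            found.append(os.path.relpath(full, root))
    return sorted(found)

with tempfile.TemporaryDirectory() as tmp:
    nested = os.path.join(tmp, "data", "raw")
    os.makedirs(nested, exist_ok=True)
    with open(os.path.join(nested, "sample.csv"), "w") as handle:
        handle.write("a,b\n")
    files = list_files(tmp)

print(files)
print("path separator on this system:", repr(os.sep))
```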

sys

This module provides access to some variables used or maintained by the interpreter and to functions that interact strongly with the interpreter. It is always available.
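A few of those variables, collected in a small sketch (the helper `describe_interpreter` is made up for this example):

```python
import sys

def describe_interpreter():
    """Collect a few interpreter facts that sys exposes."""
    return {
        "platform": sys.platform,
        "executable": sys.executable,
        "search_path_entries": len(sys.path),
        "maxsize": sys.maxsize,
    }

for key, value in describe_interpreter().items():
    print(f"{key}: {value}")

# Diagnostics conventionally go to stderr rather than stdout.
print("done", file=sys.stderr)
```

With Anaconda installed as described above, `executable` should point into the chosen target directory.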

time

This module provides various time-related functions.
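One common use is measuring how long a piece of code takes. A minimal sketch, with a hypothetical helper `timed`:

```python
import time

def timed(func, *args):
    """Call func(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed

total, seconds = timed(sum, range(10**6))
print(f"sum = {total}, took {seconds:.4f} s")
```

`time.perf_counter` is intended for measuring short durations; for wall-clock timestamps, `time.time` is the usual choice.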

traceback

This module provides a standard interface to extract, format and print stack traces of Python programs. It exactly mimics the behavior of the Python interpreter when it prints a stack trace.
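For instance, a stack trace can be captured as a string instead of letting the program terminate. In this sketch, the function `risky` is a stand-in for any failing code:

```python
import traceback

def risky():
    return 1 / 0

try:
    risky()
except ZeroDivisionError:
    # Same text the interpreter would print, but as a string for logging
    report = traceback.format_exc()

print(report)
```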

Other Tutorials/Exercises

Books

  • Swaroop, A Byte of Python, CC-BY-SA. (Free pdf and epub download, audience: programming beginners)
  • Allen B. Downey, Think Python, 2nd edition, 2015. (Caution: the 1st edition uses Python 2. Free pdf and html download, sample code available on webpage and github, audience: Python beginners with programming experience)
  • Idris2016
  • McKinney2012

(Free) Courses

Quizzes/Exercises/Generic Projects

TODOs

  • check the setup part for anaconda & friends (let's say as number 0)
  • split every notebook into mandatory/optional part
  • for numpy: improve the didactic structure. Parts are redundant (e.g. operations), parts don't follow a logical order (broadcasting)